Description

Background of the Invention 

This invention relates to recombinant plant nucleic acids and polypeptides. 

Improved means to manipulate plant gene expression is desired for a variety of industrial, agricultural, and com- 
mercial food uses. To produce new plant varieties, it is necessary to change the genetic makeup of the crop or plant in 
question. Desirable genes have to be incorporated into the crop or plant, and undesirable genes have to be eliminated 
or replaced. In other words, one needs to genetically engineer the plant to meet the demands of agriculture. Accordingly, 
genetic engineering of crop plants necessitates methods of identifying potentially valuable genes and transferring these 
to the crop that one desires to improve. 

Summary of the Invention 

We have identified and describe herein a novel plant transcriptional activator from the crucifer, Arabidopsis thaliana. 
In addition to its role as a transcriptional activator, we have also determined that this protein plays a role in plant defense 
mechanisms by interacting with proteins, e.g., 3-O-methyltransferase and ascorbate peroxidase, involved in protecting 
plants from pathogens. We named this protein AFT1 (Arabidopsis Fourteen-Three-three 1) because it shows sequence 
homology to the widespread 14-3-3 protein family. 

The AFT1 protein provides a means to enhance, control, modify or otherwise alter plant gene expression, e.g., as
a transcription activator or as a chimeric transcriptional activator, or even to modulate events during plant cell-signaling
processes, e.g., signal transduction events involved in plant defense responses to pathogens such as fungi, nematodes,
insects, bacteria, and viruses. Of special interest are the nucleic acid sequences corresponding to not only other AFT1
proteins found in the plant kingdom, but also sequences corresponding to proteins which interact with AFT1 during plant
signal transduction events, e.g., those pathways which operate during a plant's response to a pathogen, for applications
in genetic engineering, especially as related to agricultural biotechnology.

Accordingly, in general, the invention features recombinant AFT1 polypeptides, preferably including an amino acid
sequence substantially identical to the amino acid sequence shown in Fig. 1 (SEQ ID NO: 2). The invention also features
a recombinant polypeptide which is a fragment or analog of an AFT1 polypeptide that includes a domain capable of
activating transcription, e.g., AFT1 (34-248) or AFT1 (122-248). Transcription activation may be assayed, for example,
according to the methods described herein.

In various preferred embodiments, the polypeptide is derived from a plant (e.g., a monocot or dicot), and preferably
from a crucifer such as Arabidopsis.

In a second aspect, the invention features a chimeric AFT1 transcriptional activation protein including an AFT1 
polypeptide fused to a DNA-binding polypeptide. In preferred embodiments, the DNA-binding polypeptide includes, with- 
out limitation, Gal4 or LexA. 

In a third aspect, the invention features a transgenic plant containing a transgene comprising an AFT1 protein oper- 
ably linked to a constitutive (e.g., the 35S CaMV promoter) or regulated or inducible promoter (e.g., rbcS promoter). In 
other related aspects, the invention also features a transgenic plant containing a transgene containing a chimeric AFT1 
transcriptional activator protein. In related aspects, the invention features a seed and a cell from a transgenic plant 
containing the AFT1 protein, fragment or analog, or a chimeric AFT1 transcriptional activator protein. 

In a fourth aspect, the invention features a transgenic plant expressing a polypeptide of interest which involves: (a)
a nucleic acid sequence encoding a chimeric AFT1 transcriptional activator protein; and (b) a nucleic acid sequence
encoding a polypeptide of interest in an expressible genetic construction, wherein the binding of the chimeric protein
[...]. In preferred embodiments, the polypeptide of interest is, without
limitation, a storage protein, e.g., napin, legumin, or phaseolin, or any other protein of agricultural significance.

In a fifth aspect, the invention features substantially pure DNA (for example, genomic DNA, cDNA, or synthetic DNA)
encoding an AFT1 protein. Accordingly, the invention features a nucleotide sequence substantially identical to the
nucleotide sequence shown in Fig. 1 (SEQ ID NO: 1). In related aspects, the invention also features substantially pure DNA
encoding a recombinant polypeptide including an amino acid sequence substantially identical to the amino acid
sequence of AFT1 polypeptide shown in Fig. 1 (SEQ ID NO: 2). Such DNA may, if desired, be operably linked to a
constitutive or regulated or inducible promoter as described herein. In preferred embodiments, the DNA sequence is
from a crucifer (e.g., Arabidopsis). In related aspects, the invention also features a vector, a cell (e.g., a plant cell), and
a transgenic plant or seed thereof which includes such substantially pure AFT1 DNA. In various preferred embodiments,
the cell is a prokaryotic cell, for example, E. coli or Agrobacterium, or more preferably, a eukaryotic cell, for example, a
transformed plant cell derived from a cell of a transgenic plant.

In a sixth aspect, the invention features a recombinant polypeptide which is a fragment or analog of an AFT1 polypeptide
(SEQ ID NO: 2) including a domain capable of interacting with a plant defense related protein. Preferably, the polypeptide
is AFT1 (33-194). In related aspects, the invention also features substantially pure DNA encoding an AFT1
[...] (SEQ ID NO: 1). In other aspects, the DNA is operably linked to a constitutive or regulated or inducible promoter.

By "crucifer" is meant any plant that is classified within the Cruciferae family as commonly described in, e.g., Gray's
Manual of Botany, American Book Company, N.Y., 1950; Hortus Third: A Concise Dictionary of [...]

By "substantially identical" is meant a polypeptide or nucleic acid exhibiting at least 90%, more
preferably 95%, and most preferably 97% homology to a reference amino acid or nucleic acid sequence.
For polypeptides, the length of comparison sequences will generally be at least 16 amino acids, preferably at least 20
amino acids, more preferably at least 25 amino acids, and most preferably 35 amino acids. For nucleic acids, the length
of comparison sequences will generally be at least 50 nucleotides, preferably at least [...] nucleotides, more preferably
at least 75 nucleotides, and most preferably 110 nucleotides.
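
The thresholds in this definition can be illustrated with a short sketch. It is only an illustration under stated assumptions: it takes two sequences that have already been aligned (gaps written as "-"), uses simple percent identity as a stand-in for the "homology" measure, and the function names and example peptides are hypothetical rather than part of the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def substantially_identical(aligned_a: str, aligned_b: str, *,
                            is_protein: bool, threshold: float = 90.0) -> bool:
    """Apply the minimum comparison length (16 aa or 50 nt) and the identity threshold."""
    min_length = 16 if is_protein else 50
    compared = sum(1 for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-")
    return compared >= min_length and percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-residue peptides differing at one position
reference = "MATPGASSARDEFVYMAKLA"
variant = "MATPGASSARDEFVYMAKLV"
print(percent_identity(reference, variant))
print(substantially_identical(reference, variant, is_protein=True))
```

Raising `threshold` to 95.0 or 97.0 corresponds to the more preferred levels recited in the definition.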

[...] proteins and naturally-occurring organic molecules with which it is naturally associated. [...] at
least 75%, more preferably at least 90%, and most preferably at least 99%, by weight [...]

[...]

A protein is substantially free of naturally associated components when it is separated from those contaminants
[...] different from the cell from which it naturally originates [...]

By "substantially pure DNA" is meant DNA that is
[...] which, in the naturally-occurring genome of the organism from which the DNA of the invention is
[...] a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease [...]

By "transformed cell" is meant a cell into which [...]

By "promoter" is meant a DNA sequence sufficient to direct transcription; such elements may be located in the [...]
capable of mediating gene expression in response to a variety of developmental (e.g., cell-specific [...])
[...] and hormonal cues including, but [...]
[...] chlorophyll a/b, or E 2 promoters described herein.

By "operably linked" is meant that a gene and a regulatory sequence(s) (e.g., a promoter) are connected in such a
[...]

By "plant cell" is meant any self-propagating cell bounded by a semipermeable membrane and containing a plastid.
Such a cell also requires a cell wall if further propagation is desired. Plant cell, as used [...]
algae, cyanobacteria, seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots,
gametophytes, sporophytes, pollen, and microspores.

By "transgene" is meant any piece of DNA which is inserted by artifice into a cell, and becomes part of the genome
of the organism which develops from that cell. Such a transgene may include a gene which is partly or entirely heterol- 
ogous (i.e., foreign) to the transgenic organism, or may represent a gene homologous to an endogenous gene of the 
organism. 

By "transgenic" is meant any cell which includes a DNA sequence which is inserted by artifice into a cell and becomes
part of the genome of the organism which develops from that cell. As used herein, the transgenic organisms are generally
transgenic plants and the DNA (transgene) is inserted by artifice into either the nuclear or plastidic genome.

By "plant defense related protein" is meant any protein which is involved in the protection or resistance to plant pests 
(e.g., bacteria, insects, nematodes, fungi, and viruses). Such proteins include, without limitation, 3-O-methyltransferases, 
ascorbate peroxidases, chalcone synthases, hydroxyproline-rich glycoproteins, glucanases, chitinases, and proteinase
inhibitors. 

Other features and advantages of the invention will be apparent from the following description of the preferred 
embodiments thereof, and from the claims. 

Detailed Description 

The drawings will first be briefly described. 

Drawings 

Fig. 1 is the nucleic acid sequence (SEQ ID NO:1) and deduced amino acid sequence of Arabidopsis AFT1 (SEQ 
ID NO:2). 

Fig. 2 shows the LexA-dependent activation of LEU2 expression by AFT1; activation was monitored by the growth
of yeast on a leucine-minus plate. The AFT1 clone in vector pJG4-5, which directs the production of AFT1/B42 fusion
protein, was introduced into the yeast strain EGY48 where different plasmids had already been introduced. The plasmids
which either direct production of different LexA fusion proteins or no LexA protein are pEG202 (LexA alone, a), pHM1-1
(LexA/Bicoid, b), pHM12 (LexA/Cdc2, c), pHM7-3 (LexA/Ftz homeodomain, d), pAKR1-261 (LexA/AKR1-261, e),
pAKR249-434 (LexA/AKR249-434, f), pAKR114-434 (LexA/AKR114-434, g), and pHM (no LexA, h).

Figs. 3A and 3B are schematic representations showing transcription activation by AFT1. The effects of various
fusion proteins were monitored by the growth of yeast in the absence of leucine and quantitated by measuring the activity
of β-galactosidase. Panel (A) shows transcription activation by AFT1 and its derivatives fused to the activation domain
B42 upon introduction into the yeast strain EGY48. This strain also contains the plasmid pEG202, which directs constitutive
production of LexA protein, and plasmid pSH18-34, which contains the reporter gene LexAop-LacZ. Panel (B)
shows transcription activation by AFT1 and its derivatives fused to the LexA protein in the plasmid pEG202 upon introduction
into the yeast strain EGY48 containing the plasmid pSH18-34 only.

Fig. 4 shows a genomic Southern blot analysis. The blot was probed with a labeled AFT1 cDNA clone. The lanes
labeled C contain Columbia DNA and L, Landsberg DNA. The restriction enzymes used are indicated above the lanes.
The sizes of λ-Hind III digested DNA fragments used as length markers are shown on the left.

Figs. 5A, 5B and 5C show an RNA blot analysis of AFT1 expression. Panel (A) shows the developmental expression
of AFT1. RNAs were extracted from greenhouse-grown plants. Panel (B) shows the organ-specific expression of AFT1.
RNAs of leaf, root and stem were extracted from plate-grown plants, and RNAs of flower and silique were extracted 
from greenhouse-grown plants. Panel (C) shows the effect of light on the expression of Lhca2 and AFT1. RNAs were 
extracted from greenhouse-grown plants. 

Fig. 6 shows the DNA sequence (SEQ ID NO: 17) of an isolated cDNA found to be an AFT1 interacting protein 
coding for ascorbate peroxidase. 

Fig. 7 shows the partial amino acid sequence (SEQ ID NO: 18) of ascorbate peroxidase deduced from the isolated 
cDNA (SEQ ID NO: 17). 

Fig. 8 shows the DNA sequence (SEQ ID NO: 19) of an isolated cDNA found to be an AFT1 interacting protein 
coding for 3-O-methyltransferase. 

Fig. 9 shows the partial amino acid sequence (SEQ ID NO: 20) of 3-O-methyltransferase deduced from the isolated
cDNA (SEQ ID NO: 19). 

Fig. 10 shows the DNA sequence (SEQ ID NO: 21) of an isolated cDNA found to be an AFT1 interacting protein
coding for an Arabidopsis ankyrin repeat protein AKR2.

Fig. 11 shows the partial amino acid sequence (SEQ ID NO: 22) of an Arabidopsis ankyrin repeat protein AKR2
deduced from the isolated cDNA (SEQ ID NO: 21).

Fig. 12 shows the DNA sequence (SEQ ID NO: 23) of an isolated cDNA found to be an AFT1 interacting protein
coding for proteasome.

Fig. 13 shows the partial amino acid sequence (SEQ ID NO: 24) of proteasome deduced from the isolated cDNA
(SEQ ID NO: 23).

Fig. 14 shows the DNA sequence (SEQ ID NO: 25) of an isolated cDNA found to be an AFT1 interacting protein.

Fig. 15 shows the partial amino acid sequence (SEQ ID NO: 26) deduced from the isolated cDNA (SEQ ID NO: 25).

Polypeptides According to the Invention

Polypeptides according to the invention include the entire Arabidopsis AFT1 protein (as described in Fig. 1; SEQ
ID NO: 2). These polypeptides are used, e.g., to manipulate plant gene expression at the transcriptional level (as discussed
infra) or to manipulate the plant signal transduction pathway by providing plants with the potential of resisting
pathogens such as fungi, insects, nematodes, bacteria, and viruses. Polypeptides of the invention also include any
analog or fragment of the Arabidopsis AFT1 protein capable of activating transcription in a host plant. The efficacy of
an AFT1 analog or fragment to activate transcription is dependent upon its ability to interact with the transcription complex;
such an interaction may be readily assayed using any number of standard in vivo methods, e.g., the interaction
trap mechanism described infra. Similarly, the polypeptides of the invention include chimeric AFT1 transcriptional activator
proteins capable of selectively activating transcription of a specified gene.

Specific AFT1 analogs of interest include full-length or partial (described infra) AFT1 proteins, including amino acid
sequences which differ only by conservative amino acid substitutions, for example, substitutions of one amino acid for
another of the same class (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative amino
acid substitutions, deletions, or insertions at positions of the amino acid sequence which will not destroy AFT1's ability
to activate transcription (e.g., as assayed infra).

Specific AFT1 fragments of interest include any portions of the AFT1 protein which are capable of interaction with
an AFT1 ligand, e.g., a member of the transcriptional complex or a protein involved in plant defense mechanisms, such
as 3-O-methyltransferase and ascorbate peroxidase. Identification of such ligands may be readily assayed using any
number of standard in vivo methods, e.g., the interaction trap mechanism described infra.

There now follows a description of the cloning and characterization of an Arabidopsis AFT-encoding cDNA useful 
in the instant invention, and a characterization of its ability to activate transcription, and its protein interacting properties. 
This example is provided for the purpose of illustrating the invention and should not be construed as limiting. 

Isolation of an Arabidopsis Gene Encoding an AFT Protein

The Arabidopsis AFT1 gene was isolated as follows.
A yeast interaction trap system (Zervos et al., Cell 72:223-232, 1993; Gyuris et al., Cell 75:791-803, 1993) was modified
for the isolation of an Arabidopsis AFT protein. The yeast strain EGY48 (MATa trp1 ura3 his3 LEU2::pLexAop6-LEU2)
containing a plasmid pJK103 (Zervos et al., supra) that directs expression of a Gal1-lacZ gene from two high affinity
ColE1 LexA operators, was used in the interaction trap experiment. A "bait" (LexA/AKR1-261, residues 1-261 of AKRP
(Arabidopsis ankyrin repeat protein) fused to DNA binding protein LexA) was introduced into the strain and then an
Arabidopsis cDNA expression library was introduced (see, e.g., Zhang et al., Plant Cell 4:1575-1588, 1992). Selection
was first carried out on leucine minus plates, and Leu+ colonies were analyzed on X-gal plates. The clones which activated
transcription of reporter genes in the presence of, but not in the absence of, the LexA protein or its fusion derivatives
were isolated.
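
The selection rule described above (reporter activation in the presence of, but not in the absence of, LexA or its fusion derivatives) can be sketched as a small classifier. This is an illustrative sketch only: the function name, the category labels, and the example observations are hypothetical and are not data from the experiments described.

```python
def classify_clone(activates_without_lexa: bool,
                   activation_by_fusion: dict,
                   bait: str) -> str:
    """Classify a library clone from reporter read-outs (labels are hypothetical).

    activation_by_fusion maps the name of each LexA protein tested
    (the bait, LexA alone, unrelated fusions) to True if the reporter was activated.
    """
    if activates_without_lexa:
        # Activation that does not need LexA at the operator is excluded by the screen
        return "LexA-independent"
    if not activation_by_fusion.get(bait, False):
        return "no activation with bait"
    other_fusions = [name for name in activation_by_fusion if name != bait]
    if any(activation_by_fusion[name] for name in other_fusions):
        # Activates with LexA alone or unrelated baits: not a bait-specific partner
        return "LexA-dependent, not bait-specific"
    return "bait-specific candidate"


# Hypothetical observations for one clone
observations = {"LexA/AKR1-261": True, "LexA alone": True, "LexA/Bicoid": True}
print(classify_clone(False, observations, bait="LexA/AKR1-261"))
```

A clone in the third category would pass the initial selection yet would still need to be separated from true interaction partners of the bait.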

The oligo(dT)-primed activation-tagged cDNA expression library in vector pJG4-5 (Gyuris et al., supra) was made
from mRNA of four-week-old Arabidopsis leaves. The yeast strain EGY48, the vector plasmids pJG4-5 and pEG202,
and the plasmids pHM1-1, pHM7-3, pHM12, pHMfc and pSH18-34 were provided by Dr. Roger Brent. The LexA/AKR
fusion proteins were constructed as follows. The oligonucleotides used to amplify desired AKR fragments which were
later subcloned into pEG202 are shown below.

OAB-9: GCGGAATTCATGAGGCCCATTAAAATT (SEQ ID NO: 3)
OAB-10: GTAGGATCCGGTCGGATTTCTTGTCGC (SEQ ID NO: 4)
OAB-11: CGCGAATTCAATAGCGACAAGTACGAT (SEQ ID NO: 5)
OAB-12: GTAGGATCCGTCTCTCTTCCAAGGTAGA (SEQ ID NO: 6)
OAB-20: GATCCTAGAATTCAAGAAGAATCGGCGTGGC (SEQ ID NO: 7)
The combinations of oligonucleotides used for fusion proteins are: OAB-9 and OAB-10 (LexA/AKR1-261); OAB-11 and
OAB-12 (LexA/AKR249-434); OAB-20 and OAB-12 (LexA/AKR114-434). Normally, with this technique, a library that
expresses cDNA-encoded proteins fused to a transcription activator domain (B42) is introduced into a special yeast
strain. This strain also contains a plasmid which directs constitutive production of a transcriptionally inert LexA fusion
protein which is called the "bait" (LexA fused to the protein of interest) and two reporter genes. The transcription of these
two reporter genes can be stimulated if the cDNA-encoded protein complexes with the bait. One reporter gene, LEU2,
allows growth in the absence of leucine and the other reporter gene, LacZ, codes for β-galactosidase.

We found that many proteins encoded by Arabidopsis cDNAs activated transcription with LexA protein alone, or
with many different baits, although all of these proteins required a LexA binding domain. This results in the isolation of
cDNA clones which are not true interaction partners of the "bait" and requires further analysis to separate these "false
positive" clones from the desired partner clones. Examples of activation by AFT1 which is dependent upon the presence
of LexA are shown in Fig. 2. To further understand such activation, we characterized 81 cDNA clones which encoded
proteins capable of activating the expression of the reporter genes. Among the cDNAs sequenced, 36 clones were
derived from the same gene which encodes a 14-3-3-like protein. This gene was named AFT1 (Arabidopsis Fourteen-
Three-three 1), and the protein AFT1 encodes is designated as AFT1. AFT1 contains 248 amino acids with a molecular
weight of about 28 kDa.
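
As a rough plausibility check on the stated size, the mass of a 248-residue protein can be estimated from its length alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this specification; the actual value depends on amino acid composition.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def estimate_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


print(estimate_mass_kda(248))  # a little over 27 kDa, in line with "about 28 kDa"
```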

Transcription Activation by AFT1

A series of experiments were performed to determine which AFT1 sequences were required for transcriptional
activation in the yeast interaction trap system. Accordingly, a series of deletion constructs were made and analyzed
according to methods known in the art as follows. To test activation by B42/AFT1 fusion proteins, a series of AFT1
derivatives fused to B42 in the plasmid pJG4-5 were constructed. These plasmids were introduced into the strain EGY48
containing the plasmid pEG202, which directs the constitutive production of LexA protein, and the plasmid pSH18-34,
which contains the LexAop-LacZ reporter gene. To test activation by LexA/AFT1 fusion proteins, a series of AFT1 derivatives
fused to LexA in the plasmid pEG202 were constructed and introduced into the strain EGY48 containing
the plasmid pSH18-34. Transcription activation by AFT1 and its derivatives was measured by the growth of yeast on
leucine minus plates and the activity of β-galactosidase. The assay for β-galactosidase was conducted as described by
Zervos et al., supra. The oligonucleotides used to amplify desired AFT1 fragments which were later subcloned into
pJG4-5 and pEG202 are shown below.

JW-5: CTGACTGAATTCATGGCGGCGACATTAGG (SEQ ID NO: 8)
JW-6: GACTGAGTCGACCCTTCATCTAGATCCTC (SEQ ID NO: 9)
JW-7: GACTGACTCGAGCCTTCATCTAGATCCTCA (SEQ ID NO: 10)
JW-8: CTGACTGAATTCGAGTCTAAGGTCTTTAC (SEQ ID NO: 11)
JW-9: GACTGACTCGAGACTCGCTCCAGCAGATGG (SEQ ID NO: 12)
JW-10: GACTGACTCGAGTGAAGAATTGAGAATCTC (SEQ ID NO: 13)
JW-11: GACTGAGTCGACACTCGCTCCAGCAGATGG (SEQ ID NO: 14)
JW-12: GACTGAGTCGACTGAAGAATTGAGAATCTC (SEQ ID NO: 15)
JW-13: CTGACTGAATTCGTTACAGGCGCTACTCCAG (SEQ ID NO: 16)
The combinations of oligonucleotides used for fusion proteins were: JW-5 and JW-6 (LexA/1-248); JW-5 and JW-12
(LexA/1-194); JW-5 and JW-11 (LexA/1-121); JW-13 and JW-6 (LexA/34-248); JW-8 and JW-6 (LexA/122-248); JW-5
and JW-7 (B42/1-248); JW-5 and JW-9 (B42/1-121); JW-13 and JW-7 (B42/34-248); JW-8 and JW-7 (B42/122-248);
JW-13 and JW-10 (B42/34-194).
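
The primer pairings above can be tabulated and scanned for restriction sites in the primer tails. The sketch below is an illustration, not part of the described method: the enzyme names are annotations based on the standard recognition sequences (GAATTC for EcoRI, CTCGAG for XhoI, GTCGAC for SalI), which the text itself does not name, and the constructs made with JW-6 are left out to keep the example short.

```python
# Recognition sequences (annotation added for illustration)
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG", "SalI": "GTCGAC"}

# Primer sequences as listed in the text
PRIMERS = {
    "JW-5": "CTGACTGAATTCATGGCGGCGACATTAGG",
    "JW-7": "GACTGACTCGAGCCTTCATCTAGATCCTCA",
    "JW-8": "CTGACTGAATTCGAGTCTAAGGTCTTTAC",
    "JW-9": "GACTGACTCGAGACTCGCTCCAGCAGATGG",
    "JW-10": "GACTGACTCGAGTGAAGAATTGAGAATCTC",
    "JW-11": "GACTGAGTCGACACTCGCTCCAGCAGATGG",
    "JW-12": "GACTGAGTCGACTGAAGAATTGAGAATCTC",
    "JW-13": "CTGACTGAATTCGTTACAGGCGCTACTCCAG",
}

# Construct name -> (forward primer, reverse primer), as paired in the text
CONSTRUCTS = {
    "LexA/1-194": ("JW-5", "JW-12"),
    "LexA/1-121": ("JW-5", "JW-11"),
    "B42/1-248": ("JW-5", "JW-7"),
    "B42/1-121": ("JW-5", "JW-9"),
    "B42/34-248": ("JW-13", "JW-7"),
    "B42/122-248": ("JW-8", "JW-7"),
    "B42/34-194": ("JW-13", "JW-10"),
}


def sites_in(primer_name: str) -> list:
    """Return the names of the listed enzymes whose site occurs in the primer."""
    sequence = PRIMERS[primer_name]
    return [enzyme for enzyme, site in SITES.items() if site in sequence]


for construct, (forward, reverse) in CONSTRUCTS.items():
    print(construct, forward, sites_in(forward), reverse, sites_in(reverse))
```

In this listing every forward primer carries GAATTC, the reverse primers paired with B42 constructs carry CTCGAG, and those paired with LexA constructs carry GTCGAC.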

Results from such experiments revealed that deletion of the C-terminal half of AFT1 (B42/1-121) completely abolished
AFT1's ability to activate, whereas deletion of either 33 or 121 residues from the N-terminus (B42/34-248 and
B42/122-248) increased activation (Fig. 3A). The reason for the increased activation is not known, but might be due to
the tertiary structures of these two fusion proteins (B42/34-248 and B42/122-248) which could result in stronger interactions
with the transcriptional machinery. Nevertheless, it is the C-terminal half that is responsible for the observed
activation when AFT1 is fused to B42, e.g., AFT1 residues 34-248 (SEQ ID NO: 2) and 122-248 (SEQ ID NO: 2). However,
since B42 is an activator domain, the observed transcription activation may be due to the direct interaction of AFT1 with
LexA, thereby bringing B42 into the proximity of the reporter gene promoter. An alternate possibility is suggested by the
acidic nature of AFT1 (pI = 4.6), namely, AFT1 itself might be a transcription activator, since it shares this acidic feature
with many transcription activators.

AFT1 was also fused directly to LexA to test if AFT1 can activate transcription. The results shown in Fig. 3B demonstrate
that AFT1 does activate transcription. To determine which portion of AFT1 was important for activation, 54
amino acids were deleted from the AFT1 C-terminus (LexA/1-194). This deletion caused AFT1 to lose its ability to activate
completely, whereas deletion of 33 amino acids from the N-terminus (LexA/34-248) decreased activation by about 75%.
As shown in Panel B of Fig. 3, when the N-terminal half of AFT1 was deleted (LexA/122-248), activation dropped to
basal levels. Thus, even though the C-terminal half is critical for activation and is more acidic than the N-terminal half,
the N-terminal half also plays a role in activation.
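
The construct names encode the AFT1 residues retained, so the number of residues removed from each terminus follows by subtraction from the full length of 248. A minimal sketch (the function name is hypothetical):

```python
FULL_LENGTH = 248  # residues in full-length AFT1, as stated in the text


def residues_deleted(first: int, last: int, full_length: int = FULL_LENGTH) -> tuple:
    """Return (N-terminal residues removed, C-terminal residues removed)."""
    if not 1 <= first <= last <= full_length:
        raise ValueError("residue range must lie within the full-length protein")
    return first - 1, full_length - last


for name, (first, last) in {
    "1-194": (1, 194),
    "34-248": (34, 248),
    "122-248": (122, 248),
    "1-121": (1, 121),
}.items():
    print(name, residues_deleted(first, last))
```

The 1-194 construct lacks 54 C-terminal residues and the 34-248 construct lacks 33 N-terminal residues, matching the deletions described.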

AFT1 Copy Number 

The copy number of the AFT1 gene was determined by genomic DNA (Southern) blot analysis. Genomic DNA was
prepared according to the method of Dellaporta et al. (Plant Mol. Biol. Rep. 4:19-21, 1983), digested with restriction
enzymes, electrophoresed (5 µg per lane), blotted to a Biotrans™ Nylon membrane, and hybridized with labeled AFT1
cDNA clone. Hybridizations were carried out according to the method of Church and Gilbert (Proc. Natl. Acad. Sci. USA
81:1991-1995, 1984) using probes labeled by random priming. The washing conditions were as follows: two times (10
minutes each) in 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), and 5.0% SDS at 63°C; then four times (5 minutes
each) in 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), and 1% SDS at 63°C. The conditions for deprobing filters were as follows:
two times (15 minutes each) in 2 mM Tris (pH 8.2), 2 mM EDTA (pH 8.0), and 0.1% SDS at 70°C for DNA blots and at
80°C for RNA blots.

As shown in Fig. 4, digestion of two ecotypes (Columbia and Landsberg) of Arabidopsis DNA with the enzymes, 
Bgl II and Hind III, gave rise to two bands after the DNA blot was probed with a labelled AFT1 cDNA sequence. These 
data indicate that only one copy of AFT1 was present in both ecotypes of Arabidopsis, since there was one restriction 
site for Bgl II and one site for Hind III within the AFT1 cDNA, respectively. 
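
The copy-number reasoning can be made explicit with a small calculation. The sketch assumes a simplified model that is not stated in the text: an enzyme cutting n times inside the probed region splits each gene copy into n + 1 hybridizing fragments, every fragment resolves as its own band, and no further sites lie in introns of the probed region. The function names are hypothetical.

```python
def expected_bands(gene_copies: int, internal_sites: int) -> int:
    """Bands expected when each gene copy is cut at `internal_sites` positions within the probe."""
    return gene_copies * (internal_sites + 1)


def inferred_copies(observed_bands: int, internal_sites: int) -> float:
    """Invert the simple model to estimate copy number from the observed band count."""
    return observed_bands / (internal_sites + 1)


print(expected_bands(gene_copies=1, internal_sites=1))   # single-copy gene, one internal site
print(inferred_copies(observed_bands=2, internal_sites=1))
```

Under this model, two bands with an enzyme that cuts once inside the cDNA are what a single-copy gene would give.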

Developmental Expression Pattern of the AFT1 Gene in Arabidopsis

The developmental and organ-specific expression of AFT1, as well as the light regulation of AFT1 expression, were
studied by RNA (Northern blot) analysis. Total RNA was isolated according to the method of Logemann et al. (Anal.
Biochem. 163:16-20, 1987), separated by electrophoresis (15 µg per lane), blotted to a Biotrans™ Nylon membrane,
and hybridized to the labeled AFT1 cDNA clone and the Arabidopsis Lhca2 cDNA clone. The conditions for hybridization
and washing were the same as described in genomic Southern analysis supra. RNAs were extracted from Arabidopsis
grown either in a greenhouse (16 hr light/8 hr dark at 25 ± 5°C) or on agarose plates in a tissue culture room (16 hr
light/8 hr dark at 20 ± 2°C). Greenhouse-grown plants were used for developmental expression analyses. Leaves were
harvested weekly for RNA preparation. Greenhouse-grown plants were also used for light induction experiments. At four
weeks, plants were moved to a dark chamber for three days, then shifted back to light. Leaves were then harvested
every two hours. Tissue culture-grown plants were used for organ-specific expression analyses. Leaf, root, and stem
mRNAs were isolated from plants grown for 35 days on agarose plates in MS media supplemented with 1% sucrose, and
the flower and silique mRNAs were isolated from plants grown for 35 days in the greenhouse. The MS was purchased
from Sigma (Cat# M-0153). As shown in Fig. 5, Panel A and Table I, when total RNAs isolated from leaves of one- to
five-week-old plants were hybridized to a labelled AFT1 cDNA, the steady-state mRNA level of AFT1 did not change
significantly over a five-week period.

When RNAs isolated from different organs were analyzed, the steady-state mRNA level in silique was found to be
about one fifth of that in flower, whereas the mRNA levels in leaves, roots, and stems were about the same (Fig. 5, Panel
B; Table I). It should be noted that the mRNA levels from flowers and siliques are not directly comparable to those from
leaves, roots, and stems (Fig. 5, Panel B), because they were from materials grown under different conditions (as
described supra). However, the steady-state mRNA levels of flower and silique can be compared to that of five-week-old
leaves shown in Fig. 5, Panel A. The quantitative data indicate that the AFT1 mRNA level in leaves is about two
times higher than that in flowers and nine times higher than that in siliques (Table I, infra). The growth conditions can
affect the steady-state mRNA level since greenhouse-grown plants contained three times more AFT1 mRNA than plate-grown
plants (Fig. 5, Panels A and B; Table I, infra). These data indicate that although AFT1 expression is probably
required throughout much of the Arabidopsis life cycle, its steady-state mRNA level is still regulated organ-specifically.
Furthermore, dark-adapted plants contain at least two times more steady-state mRNA than plants grown in light (Fig.
5, Panel C; Table I, infra), suggesting that light plays a role in the down-regulation of AFT1 expression.
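
The approximate fold differences quoted in this passage can be checked against one another. The sketch uses only the rounded ratios stated in the text (leaf about 2 times flower, leaf about 9 times silique), not the underlying blot intensities, and treats "about" as exact for the arithmetic.

```python
from fractions import Fraction

# Rounded ratios as stated in the text
leaf_over_flower = Fraction(2)
leaf_over_silique = Fraction(9)

# silique/flower = (leaf/flower) / (leaf/silique)
silique_over_flower = leaf_over_flower / leaf_over_silique

print(silique_over_flower, float(silique_over_flower))
```

The implied silique-to-flower ratio of 2/9 (about 0.22) agrees with the statement that the level in silique is about one fifth of that in flower.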

The relative intensities of AFT1 mRNA derived from the data in Figs. 5A-5C are shown below in Table I. The relative
intensity data were collected from β-scanning of RNA gel blots by a Blot Analyzer, and normalized using the intensity of
[...]

We have shown that the AFT1 gene of Arabidopsis encodes a novel protein which can activate transcription in yeast.
Accordingly, we conclude that AFT1 functions as a transcriptional activator.

Chimeric AFT1 Proteins As Targeted Transcriptional Activators

Since plant gene expression varies in accordance with developmental stages of different cell types and in response
to different environmental factors and hormonal cues, the proteins (including the gene regulatory sequences) of the
present invention are most useful for applications aimed at improving or engineering plant varieties of agricultural or
commercial interest.

Accordingly, the invention, in general terms, also involves the construction of and use of novel chimeric AFT1 proteins
capable of selectively activating transcription of a specified gene, e.g., a crucifer storage protein such as napin. Targeted
transcription of a gene is achieved by imbuing the AFT1 transcriptional activator with the ability to selectively activate a
specific gene by fusing it to a DNA-binding domain which is capable of binding to the 5' upstream regulatory region,
e.g., in the vicinity of the transcription start site. Such chimeric proteins contain two parts: the AFT1 transcriptional
activation region (described supra) and a DNA binding domain that is directed to or specific for the transcriptional initiation
region of interest. For example, a chimeric AFT1 transcriptional activator protein may be produced by fusing a Gal4 DNA
binding region (see, e.g., Ma et al., Nature 334:631-633, 1988; Ma et al., Cell 48:847-853, 1988) to the transcriptional
activating portion of AFT1 according to methods known in the art (e.g., see Sadowski et al., Nature 335:563-564, 1988).
Importantly, the gene of interest, e.g., a napin storage protein gene, placed under the transcriptional control of an
AFT1 chimeric activator must include the appropriate DNA recognition sequence in its 5' upstream region. For example,
to activate napin gene expression with a Gal4-AFT1 protein, the napin gene should contain a 5' GAL4 upstream activation
sequence (UAS). Construction of such clones is well known in the art and is discussed infra. Moreover, those skilled in
the art will easily recognize that the DNA binding domain component of the chimeric activator protein may be derived
from any appropriate eukaryotic or prokaryotic source. Thus, fusion genes encoding chimeric AFT1 transcriptional activator
proteins can be constructed which include virtually any DNA binding domain and the AFT1 transcriptional activator,
provided that the gene placed under the transcriptional control of the AFT1 chimeric activator contains the requisite DNA
regulatory sequences which facilitate its binding. Such chimeric AFT1 transcriptional activator proteins are capable of
activating transcription efficiently in transgenic plants (plasmid construction discussed infra). Furthermore, cells expressing
such chimeric AFT1 transcriptional activator proteins, e.g., AFT1/Gal4, are capable of specifically activating and
overexpressing the desired gene product.
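
The requirement that the target gene carry the recognition sequence of the chosen DNA binding domain can be checked computationally before a construct is built. The sketch below scans a promoter for the commonly cited Gal4 binding-site consensus, CGG-N11-CCG; that consensus is background knowledge rather than something stated in this specification, and the promoter string and function name are hypothetical.

```python
import re

# Commonly cited Gal4 site consensus: CGG, any 11 bases, CCG (17 bp).
# The lookahead lets overlapping candidate sites be reported.
UAS_PATTERN = re.compile(r"(?=(CGG[ACGT]{11}CCG))")


def find_uas_sites(promoter: str) -> list:
    """Return (0-based position, matched 17-mer) for each candidate Gal4 site."""
    sequence = promoter.upper()
    return [(match.start(), match.group(1)) for match in UAS_PATTERN.finditer(sequence)]


# Hypothetical promoter fragment containing one consensus-type site
promoter = "TTGACA" + "CGGAGTACTGTCCTCCG" + "ATATAA"
print(find_uas_sites(promoter))
```

Because the reverse complement of CGG-N11-CCG has the same form, scanning one strand is enough to find sites in either orientation.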

To identify effective chimeric AFT1 transcriptional activator proteins in vivo or in vitro, functional analyses are performed.
Such assays may be carried out using transiently transformed plant cells or transgenic plants harboring the
appropriate transgenes, e.g., an AFT1/Gal4 transcriptional activator and a storage protein promoter region containing
the requisite Gal4 DNA binding sequences, according to standard methods (see, e.g., Gelvin et al., supra).

To identify particularly useful combinations, i.e., chimeric AFT1 activators and their cognate genes, plasmids are constructed and analyzed in either transient assays or in vivo in transgenic plants. Construction of chimeric transgenes is by standard methods (see, e.g., Ausubel et al., supra). The wild-type promoter of a specific gene, e.g., the crucifer napin storage protein, containing in its regulatory region the appropriate DNA-binding sequence, e.g., Gal4, is fused to a reporter gene, for example, the β-glucuronidase gene (GUS) (see, e.g., Jefferson, Plant Mol. Biol. Rep. 5: 387, 1987) in a plant expression vector and introduced into a host by any established method (as described infra) along with the cognate AFT1 chimeric transcriptional activator expression construct. By "reporter gene" is meant a gene whose expression may be assayed; such genes include, without limitation, β-glucuronidase (GUS), luciferase, chloramphenicol transacetylase (CAT), and β-galactosidase. In one particular example, the expression vector is transformed into Agrobacterium followed by transformation of the plant material, e.g., leaf discs (see, e.g., Gelvin et al., supra). Regenerated shoots are selected on medium containing, e.g., kanamycin. After rooting, transgenic plantlets are transferred to soil and grown in a growth room.

Primary transformants are then assayed for chimeric AFT1-induced GUS activity either by quantitating GUS activity or by histochemical staining as described below. Untransformed plants are taken as controls.

Fluorometric analysis of GUS activity can be performed in any plant cell protoplast or transgenic plant according to standard methodologies. Alternatively, preparations of crude plant extracts can be assayed as described, e.g., by Jefferson (supra), using extracts standardized for protein concentration (see, e.g., Bradford, Anal. Biochem. 72: 248, 1976). GUS levels in different plant tissues are assayed by enzymatic conversion of 4-methylumbelliferyl glucuronide to 4-methylumbelliferone, which is quantified with a fluorimeter (e.g., Perkin-Elmer LS 2B, Norwalk, CT). Typically, the fluorimeter is set at 455 nm emission and 365 nm excitation wavelengths. GUS activity is generally expressed as picomoles per milligram of protein per minute (see, e.g., Jefferson supra).
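
The conversion from fluorimeter readings to the units stated above can be sketched as follows. This is a minimal illustration in Python, assuming a linear 4-methylumbelliferone standard curve and a time course sampled at several points; the function names, the calibration factor `units_per_pmol`, and the example numbers are hypothetical and do not come from the cited protocols.

```python
def fluorescence_to_pmol(reading, blank, units_per_pmol):
    """Convert a raw fluorescence reading to pmol 4-MU using a linear standard curve."""
    return (reading - blank) / units_per_pmol


def gus_specific_activity(readings, times_min, blank, units_per_pmol, mg_protein):
    """Return GUS activity in pmol 4-MU per mg protein per minute.

    The rate is the least-squares slope of product (pmol) against time (min),
    divided by the amount of extract protein in the reaction (mg).
    """
    if len(readings) != len(times_min) or len(readings) < 2:
        raise ValueError("need at least two paired readings and time points")
    pmol = [fluorescence_to_pmol(r, blank, units_per_pmol) for r in readings]
    mean_t = sum(times_min) / len(times_min)
    mean_p = sum(pmol) / len(pmol)
    numerator = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_min, pmol))
    denominator = sum((t - mean_t) ** 2 for t in times_min)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return (numerator / denominator) / mg_protein


# Hypothetical time course: readings at 0, 10, 20 and 30 minutes.
activity = gus_specific_activity(
    readings=[10.0, 110.0, 210.0, 310.0],
    times_min=[0.0, 10.0, 20.0, 30.0],
    blank=10.0,
    units_per_pmol=2.0,
    mg_protein=0.05,
)
```

Fitting a slope over several time points, rather than using a single end point, makes it easier to notice a reaction that has left the linear range; extracts from untransformed control plants would be run through the same calculation.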

Alternatively, GUS activity can be assayed by in situ histochemical staining, e.g., as follows. Whole tissues and thin sections from transgenic plants and untransformed control plant tissue can be stained by incubation with 5-bromo-4-chloro-3-indolyl β-D-glucuronic acid (X-gluc; Research Organics, Inc., Cleveland OH) as described by Jefferson et al. (EMBO J 6: 3901, 1987) and Gallagher (GUS Protocols, 1992). Tissue sections are incubated at 37°C in 2 mM X-gluc in 0.1 M NaPO4 (pH 7.0), and then sectioned. GUS activity in a transformed plant is easily identified by the presence of an indigo blue precipitate within the cells expressing the reporter gene. Stained material is optionally examined microscopically using bright-field and dark-field optics.

AFT1 Interacting Proteins

Other properties of the AFT1 protein can be explored by modifying the interaction trap system described supra. For example, proteins which interact with AFT1 can be isolated and identified. To this end, we used a LexA and partial AFT1 fusion protein as a bait (LexA/AFT1 33-194, i.e., AFT1 residues 33-194 fused to LexA) to search for proteins capable of interacting with AFT1. We identified five novel cDNAs showing sequence homology to several plant genes, including plant defense related gene products, e.g., 3-O-methyltransferase (see, e.g., Poeydomenge et al., Plant Physiol. 105:749-750, 1994 and Jaeck et al., Mol. Plant-Microbe Interactions 5:294-300, 1992) and ascorbate peroxidase (see, e.g., Mittler et al., Plant J. 5:397-405, 1994; Mehdy, Plant Physiol. 105:467-472, 1994), the proteasome gene product (see, e.g., Haffter et al., Nucleic Acids Res. 19:5075, 1991), and an ankyrin repeating protein gene product, AKR2. The nucleotide sequences for these cDNAs are shown in Figs. 6 (SEQ ID NO: 17), 8 (SEQ ID NO: 19), 10 (SEQ ID NO: 21), 12 (SEQ ID NO: 23), and 14 (SEQ ID NO: 25). The deduced amino acid sequences coded for by these cDNAs are shown in Figs. 7 (SEQ ID NO: 18), 9 (SEQ ID NO: 20), 11 (SEQ ID NO: 22), 13 (SEQ ID NO: 24), and 15 (SEQ ID NO: 26).

AFT1 Polypeptide Expression 

Polypeptides according to the invention may be produced by transformation of a suitable host cell with all or part of an AFT1 cDNA (e.g., the cDNA described above) in a suitable expression vehicle or with a plasmid construct designed to express the chimeric AFT1 transcriptional activator protein supra.

Those skilled in the field of molecular biology will understand that any of a wide variety of expression systems may be used to provide the recombinant protein. The precise host cell used is not critical to the invention. The AFT1 protein or chimeric activator protein may be produced in a prokaryotic host, e.g., E. coli, or in a eukaryotic host, e.g., Saccharomyces cerevisiae, mammalian cells (e.g., COS 1 or NIH 3T3 cells), or any of a number of plant cells including, without limitation, algae, tree species, ornamental species, temperate fruit species, tropical fruit species, vegetable species, legume species, monocots, dicots, or in any plant of commercial or agricultural significance. Particular examples of suitable plant hosts include Chlamydomonas, Conifers, Petunia, Tomato, Potato, Tobacco, Arabidopsis, Lettuce, Sunflower, Oilseed rape, Flax, Cotton, Sugarbeet, Celery, Soybean, Alfalfa, Medicago, Lotus, Vigna, Cucumber, Carrot, Eggplant, Cauliflower, Horseradish, Morning Glory, Poplar, Walnut, Apple, Asparagus, Rice, Corn, Millet, Onion, Barley, Orchard grass, Oat, Rye, and Wheat.

Such cells are available from a wide range of sources including: the American Type Culture Collection (Rockville, MD); Chlamydomonas Culture Collection (Duke University), Durham, North Carolina; or from any of a number of seed companies, e.g., W. Atlee Burpee Seed Co. (Warminster, PA), Park Seed Co. (Greenwood, SC), Johnny Seed Co. (Albion, ME), or Northrup King Seeds (Hartsville, SC). Descriptions and sources of useful host cells are also found in Vasil I.K., Cell Culture and Somatic Cell Genetics of Plants, Vol I, II, III Laboratory Procedures and Their Applications, Academic Press, New York, 1984; Dixon, R.A., Plant Cell Culture-A Practical Approach, IRL Press, Oxford University, 1985; Green et al., Plant Tissue and Cell Culture, Academic Press, New York, 1987; Gasser and Fraley, Science 244:1293, 1989.

For prokaryotic expression, DNA encoding an AFT1 polypeptide of the invention is carried on a vector operably 
linked to control signals capable of effecting expression in the prokaryotic host. If desired, the coding sequence may 
contain, at its 5' end, a sequence encoding any of the known signal sequences capable of effecting secretion of the 
expressed protein into the periplasmic space of the host cell, thereby facilitating recovery of the protein and subsequent 
purification. Prokaryotes most frequently used are various strains of E. coli; however, other microbial strains may also 
be used. Plasmid vectors are used which contain replication origins, selectable markers, and control sequences derived 
from a species compatible with the microbial host. Examples of such vectors may be found in Pouwels et al. (supra) or 
Ausubel et al. (supra). Commonly used prokaryotic control sequences (also referred to as "regulatory elements") are 
defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding 
site sequences. Promoters commonly used to direct protein expression include the beta-lactamase (penicillinase), the 
lactose (lac) (Chang et al., Nature 198: 1056, 1977), the tryptophan (Trp) (Goeddel et al., Nucl. Acids Res. 8: 4057, 
1980) and the tac promoter systems as well as the lambda-derived PL promoter and N-gene ribosome binding site
(Shimatake et al., Nature 292:128, 1981).

For eukaryotic expression, the method of transformation or transfection and the choice of vehicle for expression of the AFT1 polypeptide or chimeric activator protein will depend on the host system selected. Transformation and transfection methods are described, e.g., in Ausubel et al. (supra); Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; Gelvin et al., Plant Molecular Biology Manual, Kluwer Academic Publishers, 1990; Kindle, K., Proc. Natl. Acad. Sci., USA 87:1228, 1990; Potrykus, I., Annu. Rev. Plant Physiol. Plant Mol. Biology 42:205, 1991; and BioRad (Hercules, CA) Technical Bulletin #1687 (Biolistic Particle Delivery Systems). Expression vehicles may be chosen from those provided, e.g., in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al., 1985, Supp. 1987); Gasser and Fraley (supra); Clontech Molecular Biology Catalog (Catalog 1992/93 Tools for the Molecular Biologist, Palo Alto, CA); and the references cited above.






EP 0 693 554 A1 



One preferred eukaryotic expression system is the mouse 3T3 fibroblast host cell transfected with a pMAMneo expression vector (Clontech, Palo Alto, CA). pMAMneo provides: an RSV-LTR enhancer linked to a dexamethasone-inducible MMTV-LTR promoter, an SV40 origin of replication which allows replication in mammalian systems, a selectable neomycin gene, and SV40 splicing and polyadenylation sites. DNA encoding an AFT1 polypeptide would be inserted into the pMAMneo vector in an orientation designed to allow expression. The recombinant AFT1 protein would be isolated as described below. Other preferable host cells which may be used in conjunction with the pMAMneo expression vehicle include COS cells and CHO cells (ATCC Accession Nos. CRL 1650 and CCL 61, respectively).

Alternatively, an AFT1 polypeptide is produced by a stably-transfected mammalian cell line. A number of vectors suitable for stable transfection of mammalian cells are available to the public, e.g., see Pouwels et al. (supra); methods for constructing such cell lines are also publicly available, e.g., in Ausubel et al. (supra). In one example, cDNA encoding the AFT1 polypeptide is cloned into an expression vector which includes the dihydrofolate reductase (DHFR) gene. Integration of the plasmid and, therefore, the AFT1-encoding gene into the host cell chromosome is selected for by inclusion of 0.01-300 µM methotrexate in the cell culture medium (as described in Ausubel et al., supra). This dominant selection can be accomplished in most cell types. Recombinant protein expression can be increased by DHFR-mediated amplification of the transfected gene. Methods for selecting cell lines bearing gene amplifications are described in Ausubel et al. (supra); such methods generally involve extended culture in medium containing gradually increasing levels of methotrexate. DHFR-containing expression vectors commonly used for this purpose include pCVSEII-DHFR and pAdD26SV(A) (described in Ausubel et al., supra). Any of the host cells described above or, preferably, a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC Accession No. CRL 9096) are among the host cells preferred for DHFR selection of a stably-transfected cell line or DHFR-mediated gene amplification.

Most preferably, an AFT1 polypeptide or AFT1 chimeric transcriptional activator is produced by a stably-transfected plant cell line or by a transgenic plant. A number of vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants are available to the public; such vectors are described in Pouwels et al. (supra), Weissbach and Weissbach (supra), and Gelvin et al. (supra). Methods for constructing such cell lines are described in, e.g., Weissbach and Weissbach (supra), and Gelvin et al. (supra). Typically, plant expression vectors include (1) a cloned plant gene under the transcriptional control of 5' and 3' regulatory sequences and (2) a dominant selectable marker. Such plant expression vectors may also contain, if desired, a promoter regulatory region (e.g., one conferring inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.

Once the desired AFT1 nucleic acid sequence is obtained, it may be manipulated in a variety of ways known in the art. For example, where the sequence involves non-coding flanking regions, the flanking regions may be subjected to mutagenesis.

The AFT1 DNA sequence of the invention may, if desired, be combined with other DNA sequences in a variety of ways. The AFT1 DNA sequence of the invention may be employed with all or part of the gene sequences normally associated with the AFT1 protein. In its component parts, a DNA sequence encoding an AFT1 protein is combined in a DNA construct having a transcription initiation control region capable of promoting transcription and translation in a host cell.

In general, the constructs will involve regulatory regions functional in plants which provide for modified production 
of AFT1 protein or a chimeric AFT1 protein as discussed supra. The open reading frame coding for the AFT1 protein or
functional fragment thereof will be joined at its 5' end to a transcription initiation regulatory region such as the sequence 
naturally found in the 5' upstream region of the AFT1 structural gene. Numerous other transcription initiation regions 
are available which provide for constitutive or inducible regulation. 

For applications when developmental, hormonal or environmental expression is desired, appropriate 5' upstream non-coding regions are obtained from other genes; for example, from genes regulated during seed development, embryo development, or leaf development.

Regulatory transcript termination regions may also be provided in DNA constructs of this invention. Transcript termination regions may be provided by the DNA sequence encoding the AFT1 protein or any convenient transcription termination region derived from a different gene source, especially the transcript termination region which is normally associated with the transcript initiation region. The transcript termination region will contain preferably at least 1 kb, preferably about 3 kb, of sequence 3' to the structural gene from which the termination region is derived. Plant expression constructs having AFT1 as the DNA sequence of interest for expression thereof may be employed with a wide variety of plant life, particularly plant life involved in the production of seed storage proteins or storage lipids, useful for industrial and agricultural applications. Importantly, this invention is applicable to dicotyledons and monocotyledons, and will be readily applicable to any new or improved transformation or regeneration method.

An example of a useful plant promoter according to the invention is a caulimovirus promoter, e.g., a cauliflower mosaic virus (CaMV) promoter. These promoters confer high levels of expression in most plant tissues, and the activity of these promoters is not dependent on virally encoded proteins. CaMV is a source for both the 35S and 19S promoters. In most tissues of transgenic plants, the CaMV 35S promoter is a strong promoter (see, e.g., Odell et al., Nature 313: 810, 1985). The CaMV promoter is also highly active in monocots (see, e.g., Dekeyser et al., Plant Cell 2:591, 1990; Terada and Shimamoto, Mol. Gen. Genet. 220:389, 1990). Moreover, activity of this promoter can be further increased (i.e., between 2-10 fold) by duplication of the CaMV 35S promoter (see, e.g., Kay et al., Science 236:1299, 1987; Ow et al., Proc. Natl. Acad. Sci., USA 84: 4870, 1987; and Fang et al., Plant Cell 1: 141, 1989).

Other useful plant promoters include, without limitation, the nopaline synthase promoter (An et al., Plant Physiol. 88: 547, 1988) and the octopine synthase promoter (Fromm et al., Plant Cell 1: 977, 1989).

For certain applications, it may be desirable to produce the AFT1 gene product in an appropriate tissue, at an appropriate level, or at an appropriate developmental time. Thus, there is an assortment of gene promoters, each with its own distinct characteristics embodied in its regulatory sequences, shown to be regulated in response to the environment, hormones, and/or developmental cues. These include gene promoters that are responsible for (1) heat-regulated gene expression (see, e.g., Callis et al., Plant Physiol. 88: 965, 1988), (2) light-regulated gene expression (e.g., the pea rbcS-3A described by Kuhlemeier et al., Plant Cell 1: 471, 1989; the maize rbcS promoter described by Schaffner and Sheen, Plant Cell 3: 997, 1991; or the chlorophyll a/b-binding protein gene found in pea described by Simpson et al., EMBO J. 4: 2723, 1985), (3) hormone-regulated gene expression (e.g., the abscisic acid responsive sequences from the Em gene of wheat described by Marcotte et al., Plant Cell 1:969, 1989), (4) wound-induced gene expression (e.g., of wun1 described by Siebertz et al., Plant Cell 1: 961, 1989), or (5) organ-specific gene expression (e.g., of the tuber-specific storage protein gene described by Roshal et al., EMBO J. 6:1155, 1987; the 23-kDa zein gene from maize described by Schernthaner et al., EMBO J. 7: 1249, 1988; or the French bean β-phaseolin gene described by Bustos et al., Plant Cell 1:839, 1989).

Plant expression vectors may also optionally include RNA processing signals, e.g., introns, which have been shown to be important for efficient RNA synthesis and accumulation (Callis et al., Genes and Dev. 1: 1183, 1987). The location of the RNA splice sequences can dramatically influence the level of transgene expression in plants. In view of this fact, an intron may be positioned upstream or downstream of an AFT1 polypeptide-encoding sequence in the transgene to modulate levels of gene expression.

In addition to the aforementioned 5' regulatory control sequences, the expression vectors may also include regulatory control regions which are generally present in the 3' regions of plant genes (Thornburg et al., Proc. Natl. Acad. Sci. USA 84: 744, 1987; An et al., Plant Cell 1: 115, 1989). For example, the 3' terminator region may be included in the expression vector to increase stability of the mRNA. One such terminator region may be derived from the PI-II terminator region of potato. In addition, other commonly used terminators are derived from the octopine or nopaline synthase signals.

The plant expression vector also typically contains a dominant selectable marker gene used to identify those cells that have become transformed. Useful selectable genes for plant systems include antibiotic resistance genes, for example, those encoding resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin. Genes required for photosynthesis may also be used as selectable markers in photosynthetic-deficient strains. Finally, genes encoding herbicide resistance may be used as selectable markers; useful herbicide resistance genes include the bar gene encoding the enzyme phosphinothricin acetyltransferase and conferring resistance to the broad spectrum herbicide Basta® (Hoechst AG, Frankfurt, Germany).

Efficient use of selectable markers is facilitated by a determination of the susceptibility of a plant cell to a particular selectable agent and a determination of the concentration of this agent which effectively kills most, if not all, of the transformed cells. Some useful concentrations of antibiotics for tobacco transformation include, e.g., 75-100 µg/ml (kanamycin), 20-50 µg/ml (hygromycin), or 5-10 µg/ml (bleomycin). A useful strategy for selection of transformants for herbicide resistance is described, e.g., by Vasil et al., supra.
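
By way of illustration only, the concentration ranges given above for tobacco can be turned into a small helper for preparing selection medium. The Python sketch below is not part of any cited protocol; the function name and the stock concentration in the example are assumptions, and the small volume contributed by the added stock is ignored.

```python
# Working ranges for tobacco transformation, in micrograms per ml of medium,
# as listed in the preceding paragraph.
SELECTION_RANGES_UG_PER_ML = {
    "kanamycin": (75.0, 100.0),
    "hygromycin": (20.0, 50.0),
    "bleomycin": (5.0, 10.0),
}


def stock_volume_ul(agent, target_ug_per_ml, medium_ml, stock_mg_per_ml):
    """Microlitres of antibiotic stock needed for the requested final concentration.

    Raises ValueError if the target lies outside the range listed for the agent.
    """
    low, high = SELECTION_RANGES_UG_PER_ML[agent]
    if not low <= target_ug_per_ml <= high:
        raise ValueError(
            f"{agent}: {target_ug_per_ml} ug/ml is outside {low}-{high} ug/ml"
        )
    total_ug = target_ug_per_ml * medium_ml
    # 1 mg/ml equals 1 ug/ul, so dividing micrograms by mg/ml gives microlitres.
    return total_ug / stock_mg_per_ml


# Hypothetical example: 500 ml of medium at 100 ug/ml from a 50 mg/ml stock.
kan_volume = stock_volume_ul("kanamycin", 100.0, 500.0, 50.0)
```

The range check is deliberately strict; for a species other than tobacco, the susceptibility determination described above would supply different bounds.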

It should be readily apparent to one skilled in the art of molecular biology, especially in the field of plant molecular biology, that the level of gene expression is dependent not only on the combination of promoters, RNA processing signals and terminator elements, but also on how these elements are used to increase the levels of selectable marker gene expression.

Plant Transformation 

Upon construction of the plant expression vector, several standard methods are available for introduction of the recombinant genetic material into the host plant for the generation of a transgenic plant. These methods include (1) Agrobacterium-mediated transformation (A. tumefaciens or A. rhizogenes) (see, e.g., Lichtenstein and Fuller, In: Genetic Engineering, vol 6, PWJ Rigby, ed, London, Academic Press, 1987; and Lichtenstein, C.P., and Draper, J., In: DNA Cloning, Vol II, D.M. Glover, ed, Oxford, IRL Press, 1985), (2) the particle delivery system (see, e.g., Gordon-Kamm et al., Plant Cell 2:603, 1990; or BioRad Technical Bulletin 1687, supra), (3) microinjection protocols (see, e.g., Green et al., supra), (4) polyethylene glycol (PEG) procedures (see, e.g., Draper et al., Plant Cell Physiol. 23:451, 1982; or, e.g., Zhang and Wu, Theor. Appl. Genet. 76:835, 1988), (5) liposome-mediated DNA uptake (see, e.g., Freeman et al., Plant Cell Physiol. 25: 1353, 1984), (6) electroporation protocols (see, e.g., Gelvin et al., supra; Dekeyser et al., supra; or Fromm et al., Nature 319: 791, 1986), and (7) the vortexing method (see, e.g., Kindle, supra). The method of transformation is not critical to the instant invention; various methods of plant transformation are currently available (supra). As newer methods become available to transform crops or other host cells, they may be directly applied. Accordingly, a wide variety of methods have been developed to insert a DNA sequence into the genome of a plant host to obtain the transcription or transcription and translation of the sequence to effect phenotypic changes in both dicots and monocots. Moreover, the manner in which the DNA construct is introduced into the plant host is not critical to the invention. Thus, any method which provides for efficient transformation may be employed.

The following is an example outlining an Agrobacterium-mediated plant transformation. The general process for manipulating genes to be transferred into the genome of plant cells is carried out in two phases. First, all the cloning and DNA modification steps are done in E. coli, and the plasmid containing the gene construct of interest is transferred by conjugation into Agrobacterium. Second, the resulting Agrobacterium strain is used to transform plant cells. Thus, for the generalized plant expression vector, the plasmid contains an origin of replication that allows it to replicate in Agrobacterium and a high copy number origin of replication functional in E. coli. This permits facile production and testing of transgenes in E. coli prior to transfer to Agrobacterium for subsequent introduction into plants. Resistance genes can be carried on the vector, one for selection in bacteria, e.g., streptomycin, and the other that will express in plants, e.g., a gene encoding kanamycin resistance or a herbicide resistance gene. Also present are restriction endonuclease sites for the addition of one or more transgenes operably linked to appropriate regulatory sequences and directional T-DNA border sequences which, when recognized by the transfer functions of Agrobacterium, delimit the region that will be transferred to the plant.

In another example, plant cells may be transformed by shooting into the cell tungsten microprojectiles on which cloned DNA is precipitated. In the Biolistic Apparatus (Bio-Rad, Hercules, CA) used for the shooting, a gunpowder charge (22 caliber Power Piston Tool Charge) or an air-driven blast drives a plastic macroprojectile through a gun barrel. An aliquot of a suspension of tungsten particles on which DNA has been precipitated is placed on the front of the plastic macroprojectile. The latter is fired at an acrylic stopping plate that has a hole through it that is too small for the macroprojectile to go through. As a result, the plastic macroprojectile smashes against the stopping plate and the tungsten microprojectiles continue toward their target through the hole in the plate. For the instant invention the target can be any plant cell, tissue, seed, or embryo. The DNA introduced into the cell on the microprojectiles becomes integrated into either the nucleus or the chloroplast.

Transfer and expression of transgenes in plant cells is now routine practice to those skilled in the art. It has become 
a major tool to carry out gene expression studies and to attempt to obtain improved plant varieties of agricultural or 
commercial interest. 

Transgenic Plant Regeneration

Plant cells transformed with a plant expression vector can be regenerated, e.g., from single cells, callus tissue or leaf discs according to standard plant tissue culture techniques. It is well known in the art that various cells, tissues and organs from almost any plant can be successfully cultured to regenerate an entire plant; such techniques are described, e.g., in Vasil supra; Green et al., supra; Weissbach and Weissbach, supra; and Gelvin et al., supra.

In one particular example, a cloned AFT1 polypeptide-encoding sequence under the control of the 35S CaMV promoter and the nopaline synthase terminator and carrying a selectable marker (e.g., kanamycin resistance) is transformed into Agrobacterium. Transformation of leaf discs (e.g., of tobacco leaf discs), with vector-containing Agrobacterium is carried out as described by Horsch et al. (Science 227: 1229, 1985). Putative transformants are selected after a few weeks (e.g., 3 to 5 weeks) on plant tissue culture media containing kanamycin (e.g., 100 µg/ml). Kanamycin-resistant shoots are then placed on plant tissue culture media without hormones for root initiation. Kanamycin-resistant plants are then selected for greenhouse growth. If desired, seeds from self-fertilized transgenic plants can then be sown in a soil-less medium and grown in a greenhouse. Kanamycin-resistant progeny are selected by sowing surface-sterilized seeds on hormone-free kanamycin-containing media. Analysis for the integration of the transgene is accomplished by standard techniques (see, e.g., Ausubel et al. supra; Gelvin et al. supra).

Transgenic plants expressing the selectable marker are then screened for transmission of the transgene DNA by standard immunoblot and DNA detection techniques. Each positive transgenic plant and its transgenic progeny are unique in comparison to other transgenic plants established with the same transgene. Integration of the transgene DNA into the plant genomic DNA is in most cases random and the site of integration can profoundly affect the levels and the tissue and developmental patterns of transgene expression. Consequently, a number of transgenic lines are usually screened for each transgene to identify and select plants with the most appropriate expression profiles.

Transgenic lines are evaluated on levels of transgene expression. Expression at the RNA level is determined initially to identify and quantitate expression-positive plants. Standard techniques for RNA analysis are employed and include PCR amplification assays using oligonucleotide primers designed to amplify only transgene RNA templates and solution hybridization assays using transgene-specific probes (see, e.g., Ausubel et al., supra). The RNA-positive plants are then analyzed for protein expression by Western immunoblot analysis using AFT1 specific antibodies (see, e.g., Ausubel et al., supra). In addition, in situ hybridization and immunocytochemistry according to standard protocols can be done using
transgene-specific nucleotide probes and antibodies, respectively, to localize sites of expression within transgenic tissue.

Once the recombinant AFT1 protein is expressed in any cell or in a transgenic plant (e.g., as described above), it may be isolated, e.g., using affinity chromatography. In one example, an anti-AFT1 antibody (e.g., produced as described in Ausubel et al., supra, or by any standard technique) may be attached to a column and used to isolate the polypeptide. Lysis and fractionation of AFT1-producing cells prior to affinity chromatography may be performed by standard methods (see, e.g., Ausubel et al., supra). Once isolated, the recombinant protein can, if desired, be further purified, e.g., by high performance liquid chromatography. These general techniques of polypeptide expression and purification can also be used to produce and isolate useful AFT1 fragments or analogs.

For certain applications, however, expression of the transgene in the plant cell or the transgenic plant may be the desired result. These include applications such as AFT1 regulation or modulation of plant defense-related gene products, e.g., 3-O-methyltransferase or ascorbate peroxidase, or altering the normal development of the plant.

Use 

Introduction of AFT1 or a chimeric AFT1 transcriptional activator into a transformed plant cell facilitates the manipulation of developmental events. For example, transgenic plants of the instant invention expressing AFT1 or a chimeric AFT1 transcriptional activator might be used to alter, simply and inexpensively, or regulate plant gene expression, e.g., plant defense mechanisms, expression of plant storage proteins, or any number of other plant developmental events.

Other Embodiments

The invention also includes any biologically active fragment or analog of a crucifer AFT1 protein. By "biologically active" is meant possessing any in vivo or in vitro activity which is characteristic of the crucifer AFT1 polypeptide shown in Fig. (SEQ ID NO: 2). Because crucifer AFT1 protein exhibits a range of physiological properties and because such properties may be attributable to different portions of the crucifer AFT1 protein molecule, a useful crucifer AFT1 protein fragment or analog is one which exhibits a biological activity in any biological assay for AFT1 transcriptional activation or signalling in response to a particular signal transduction pathway.

Preferred analogs include AFT1 proteins (or biologically active fragments thereof) whose sequences differ from the wild-type sequence only by conservative amino acid substitutions, for example, substitution of one amino acid for another with similar characteristics (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative amino acid substitutions, deletions, or insertions which do not abolish the polypeptide's biological activity.

Analogs can differ from naturally occurring AFT1 protein in amino acid sequence or can be modified in ways that do not involve sequence, or both. Analogs of the invention will generally exhibit at least 70%, preferably 80%, more preferably 90%, and most preferably 95% or even 99%, homology with a segment of 20 amino acid residues, preferably more than 40 amino acid residues, or more preferably the entire sequence of a naturally occurring AFT1 polypeptide sequence. Alterations in primary sequence include genetic variants, both natural and induced. Also included are analogs that include residues other than naturally occurring L-amino acids.
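
The percentage thresholds recited above can be illustrated with a simple calculation. The Python sketch below computes percent identity between two pre-aligned, ungapped segments of equal length and checks it against a chosen threshold and minimum segment length. It is an illustration under stated assumptions only: the specification speaks of "homology" and does not define the comparison method, so treating homology as position-by-position identity is an assumption, and the function names and placeholder sequences are hypothetical.

```python
def percent_identity(segment_a, segment_b):
    """Percent of positions with identical residues in two aligned, ungapped segments."""
    if len(segment_a) != len(segment_b) or len(segment_a) == 0:
        raise ValueError("segments must be non-empty and of equal length")
    pairs = zip(segment_a.upper(), segment_b.upper())
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(segment_a)


def meets_analog_threshold(segment_a, segment_b, min_percent=70.0, min_length=20):
    """True if the segment is long enough and identical enough to the reference."""
    if len(segment_a) < min_length:
        return False
    return percent_identity(segment_a, segment_b) >= min_percent


# Placeholder 20-residue segments (not AFT1 sequence); two positions differ.
reference_segment = "ACDEFGHIKLMNPQRSTVWY"
candidate_segment = "ACDEFGHIKLMNPQRSTVAA"
identity = percent_identity(candidate_segment, reference_segment)
```

In practice a gapped alignment from an established tool would precede such a calculation; the sketch shows only how the 70%, 80%, 90%, 95% and 99% figures relate to a 20-residue comparison window.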

Also included in the invention are crucifer AFT1 proteins modified by in vivo or in vitro chemical derivatization of polypeptides, including acetylation, methylation, phosphorylation, carboxylation, or glycosylation.

In addition to substantially full-length polypeptides, the invention also includes biologically active fragments of the polypeptides. As used herein, the term "fragment", as applied to a polypeptide, will ordinarily be at least 20 residues, more typically at least 40 residues, and preferably at least 60 residues in length.

The ability of a candidate fragment to exhibit a biological activity of crucifer AFT1 protein can be assessed by those methods described herein. Also included in the invention are crucifer AFT1 proteins containing residues that are not required for biological activity of the peptide, e.g., those added by alternative mRNA splicing or alternative protein processing events.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Zhang et al.

(ii) TITLE OF INVENTION: CRUCIFER AFT PROTEINS AND USES THEREOF

(iii) NUMBER OF SEQUENCES: 26

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fish & Richardson
(B) STREET: 225 Franklin Street
(C) CITY: Boston
(D) STATE: Massachusetts
(E) COUNTRY: U.S.A.
(F) ZIP: 02110-2604

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb
(B) COMPUTER: IBM PS/2 Model 50Z or 55SX
(C) OPERATING SYSTEM: MS-DOS (Version 5.0)
(D) SOFTWARE: WordPerfect (Version 5.1)

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Lech, Karen F.
(B) REGISTRATION NUMBER: 35,238
(C) REFERENCE/DOCKET NUMBER: 00786/219001

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (617) 542-5070
(B) TELEFAX: (617) 542-8906
(C) TELEX: 200154
(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 845
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AAAAAAAAAT CAAATCTCTC TCTTTCTCTC TCTAATGGCG GCGACATTAG GCAGAGACCA 60

GTATGTGTAC ATGGCGAAGC TCGCCGAGCA GGCGGAGCGT TACGAAGAGA TGGTTCAATT 120

CATGGAACAG CTCGTTACAG GCGCTACTCC AGCGGAAGAG CTCACCGTTG AAGAGAGGAA 180

TCTCCTCTCT GTTGCTTACA AGAACGTGAT CGGATCTCTA CGCGCCGCCT GGAGGATCGT 240

CTCTTCGATT GAGCAGAAGG AAGAGAGTAG GAAGAACGAC GAGCACGTGT CGCTTGTCAA 300

GGATTACAGA TCTAAAGTTG AGTCTGAGCT TTCTTCTGTT TGCTCTGGAA TCCTTAAGCT 360

CCTTGACTCG CATCTGATCC CATCTGCTGG AGCGAGTGAG TCTAAGGTCT TTTACTTGAA 420

GATGAAAGGT GATTATCATC GGTACATGGC TGAGTTTAAG TCTGGTGATG AGAGGAAAAC 480

TGCTGCTGAA GATACCATGC TCGCTTACAA AGCAGCTCAG GATATCGCAG CTGCGGATAT 540

GGCACCTACT CATCCGATAA GGCTTGGTCT GGCCCTGAAT TTCTCAGTGT TCTACTATGA 600

GATTCTCAAT TCTTCAGACA AAGCTTGTAA CATGGCCAAA CAGGCTTTTG AGGAGGCCAT 660

AGCTGAGCTT GACACTCTGG GAGAGGAATC CTACAAAGAC AGCACTCTCA TAATGCAGTT 720

GCTCAGGGAC AATTTAACCC TTTGGACCTC CGATATGCAG GAGCAGATGG ACGAGGCCTG 780

AGGATCTAGA TGAAGGGGGG GAGGGTTGTT ACGCCATGTT TCTGCCACCA AATCGATCTC 840

AAAAT 845

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Ala Thr Leu Gly Arg Asp Gln Tyr Val Tyr Met Ala Lys Leu
1 5 10 15

Ala Glu Gln Ala Glu Arg Tyr Glu Glu Met Val Gln Phe Met Glu Gln
20 25 30

Leu Val Thr Gly Ala Thr Pro Ala Glu Glu Leu Thr Val Glu Glu Arg
35 40 45

Asn Leu Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ser Leu Arg Ala
50 55 60

Ala Trp Arg Ile Val Ser Ser Ile Glu Gln Lys Glu Glu Ser Arg Lys
65 70 75 80

Asn Asp Glu His Val Ser Leu Val Lys Asp Tyr Arg Ser Lys Val Glu
85 90 95

Ser Glu Leu Ser Ser Val Cys Ser Gly Ile Leu Lys Leu Leu Asp Ser
100 105 110

His Leu Ile Pro Ser Ala Gly Ala Ser Glu Ser Lys Val Phe Tyr Leu
115 120 125

Lys Met Lys Gly Asp Tyr His Arg Tyr Met Ala Glu Phe Lys Ser Gly
130 135 140

Asp Glu Arg Lys Thr Ala Ala Glu Asp Thr Met Leu Ala Tyr Lys Ala
145 150 155 160

Ala Gln Asp Ile Ala Ala Ala Asp Met Ala Pro Thr His Pro Ile Arg
165 170 175

Leu Gly Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile Leu Asn
180 185 190

Ser Ser Asp Lys Ala Cys Asn Met Ala Lys Gln Ala Phe Glu Glu Ala
195 200 205

Ile Ala Glu Leu Asp Thr Leu Gly Glu Glu Ser Tyr Lys Asp Ser Thr
210 215 220

Leu Ile Met Gln Leu Leu Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp
225 230 235 240

Met Gln Glu Gln Met Asp Glu Ala 248
245
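The relationship between SEQ ID NO: 1 (nucleotide) and SEQ ID NO: 2 (amino acid) can be checked with a short translation sketch. It assumes the standard genetic code and that the open reading frame begins at the first ATG; the function name `translate_from_first_atg` is hypothetical, and the input is only the first 60 nucleotides as printed in the figure for SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG until a stop codon or an incomplete codon."""
    dna = dna.upper().replace(" ", "")
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 1-60 of SEQ ID NO: 1; the ATG start codon is at positions 35-37.
first_60 = "AAAAAAAAATCAAATCTCTCTCTTTCTCTCTCTAATGGCGGCGACATTAGGCAGAGACCA"
print(translate_from_first_atg(first_60))  # begins the SEQ ID NO: 2 sequence
```

Running the same function on the full 845-nucleotide sequence is one way to cross-check a transcribed listing against the 248-residue protein.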

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GCGGAATTCA TGAGGCCCAT TAAAATT 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTAGGATCCG GTCGGATTTC TTGTCGC 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CGCGAATTCA ATAGCGACAA GTACGAT 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTAGGATCCG TCTCTCTTCC AAGGTAGA 28

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GATCCTAGAA TTCAAGAAGA ATCGGCGTGG C 31

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CTGACTGAAT TCATGGCCGC GACATTAGG 29

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GACTGACTCG ACCCTTCATC TAGATCCTC 29

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GACTGACTCG ACCCTTCATC TAGATCCTCA 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTGACTGAAT TCGAGTCTAA GGTCTTTAC 29

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GACTGACTCG AGACTCGCTC CAGCAGATGG 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GACTGACTCG AGTGAAGAAT TGAGAATCTC 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GACTGAGTCG ACACTCGCTC CAGCAGATGG 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GACTGAGTCG AGTGAAGAAT TGAGAATCTC 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CTGACTGAAT TCGTTACAGG CGCTACTCCA G 31
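Several of the short oligonucleotides above contain stretches that match, or are the reverse complement of, parts of SEQ ID NO: 1. The following sketch shows one way to check this for SEQ ID NO: 10 against nucleotides 771-800 of SEQ ID NO: 1. That this oligonucleotide serves as a reverse primer is an inference from the sequences, not something stated in this section, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def annealing_length(primer: str, template: str) -> int:
    """Length of the longest 3' portion of the primer whose reverse
    complement occurs in the template (0 if none)."""
    for k in range(len(primer), 0, -1):
        if reverse_complement(primer[-k:]) in template:
            return k
    return 0


# SEQ ID NO: 10 as listed, and nucleotides 771-800 of SEQ ID NO: 1.
SEQ_ID_10 = "GACTGACTCGACCCTTCATCTAGATCCTCA"
TEMPLATE_771_800 = "ACGAGGCCTGAGGATCTAGATGAAGGGGGG"

k = annealing_length(SEQ_ID_10, TEMPLATE_771_800)
print(k, reverse_complement(SEQ_ID_10[-k:]))
```

Under these assumptions the 3' portion of the oligonucleotide pairs with the region spanning the end of the coding sequence, while the remaining 5' bases do not match the template.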

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 557
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCACCCAGAG AGGTCAGGCT TTGATGGACC ATGGACCCAA GAGCCGCTGA AGTTTGACAA 60

CTCCTACTTC GTGGAACTCC TGAAAGGAGA ATCAGAGGGC TTGTTGAAAC TTCCAACTGA 120

CAAGACCTTA TTGGAAGACC CGGAGTTCCG TCGTCTTGTT GAGCTTTATG CAAAGGATGA 180

AGATGCATTC TTCAGAGACT ACGCGGAATC GCACAAGAAA CTCTCTGAGC TTGGTTTCAA 240

CCCAAACTCC TCAGCAGGCA AAGCAGTTGC AGACAGCACG ATTCTGGCAC AGAGTGCGTT 300

CGGGGTTGCA GTTGCTGCTG CGGTTGTGGC ATTTGGTTAC TTTTACGAGA TTCGGAAGAG 360

GATGAAGTAA ACGAAATAGG AAGGAAAACA CGAAGCAACG ATGCTCTTAT TTGGGTATTA 420

AAGAAACTAT TAATCGTCTA TCGAATCTAT TTTGCTCCTA CAAGATTCTA AACTCTTTGA 480

ATCCACGATT CCACTGTTTA GTAGTAAAAA AGTTAAAAAG TCAATATTTT GGGTCCGTGA 540

TTCATTTTTG CGATAAA 557



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 122
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

His Pro Glu Arg Ser Gly Phe Asp Gly Pro Trp Thr Gln Glu Pro Leu
1 5 10 15

Lys Phe Asp Asn Ser Tyr Phe Val Glu Leu Leu Lys Gly Glu Ser Glu
20 25 30

Gly Leu Leu Lys Leu Pro Thr Asp Lys Thr Leu Leu Glu Asp Pro Glu
35 40 45

Phe Arg Arg Leu Val Glu Leu Tyr Ala Lys Asp Glu Asp Ala Phe Phe
50 55 60

Arg Asp Tyr Ala Glu Ser His Lys Lys Leu Ser Glu Leu Gly Phe Asn
65 70 75 80

Pro Asn Ser Ser Ala Gly Lys Ala Val Ala Asp Ser Thr Ile Leu Ala
85 90 95

Gln Ser Ala Phe Gly Val Ala Val Ala Ala Ala Val Val Ala Phe Gly
100 105 110

Tyr Phe Tyr Glu Ile Arg Lys Arg Met Lys 122
115 120
(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 478
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GAGTGACGAA CATTGCGTGA AATTCTTGAA GAACTGCTAC GAGTCACTTC CAGAGGATGG 60

AAAAGTGATA TTAGCAGAGT GTATTCTTCC AGAGACACCA GACTCAAGCC TCTCAACCAA 120

ACAAGTAGTC CATGTCGATT GCATTATGTT GGCTCACAAT CCCGGAGGCA AAGAACGAAC 180

CGAGAAAGAG TTTGAGGCAT TAGCCAAAGC ATCAGGCTTC AAGGGCATCA AAGTTGTCTG 240

CGACGCTTTT GGTGTTAACC TTATTGAGTT ACTCAAGAAG CTCTAAAAAC AAACAATGTT 300

CCTATGAAGA TGATTTATAT GTAAACATTA TCTCATATCT CCTTCCACGG TTCCAAAACT 360

ATGCTGTTTA ATAATGGTTT TTACAAGAAT TTGATTATGA GTTTGTATTT TTGTTTGTTT 420

GGAACAAAAT TATGTGATTA TAGGGAAAAA TAAAATGAGC TATTATTGAA GAAAAAAA 478



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 94
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Ser Asp Glu His Cys Val Lys Phe Leu Lys Asn Cys Tyr Glu Ser Leu
1 5 10 15

Pro Glu Asp Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Glu Thr
20 25 30

Pro Asp Ser Ser Leu Ser Thr Lys Gln Val Val His Val Asp Cys Ile
35 40 45

Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu Phe
50 55 60

Glu Ala Leu Ala Lys Ala Ser Gly Phe Lys Gly Ile Lys Val Val Cys
65 70 75 80

Asp Ala Phe Gly Val Asn Leu Ile Glu Leu Leu Lys Lys Leu 94
85 90
(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1357
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CCAGATTATC CCTCCCCCGA ATTCGGCACG AGGAAAAATC CTCTTCTTTC AGATGAGAAA 60

CCCAAATCGA CGGAGGAGAA TAAGAGTTCT AAGCCGGAAT CAGCTTCTGG GAGTTCAACT 120

TCATCAGCTA TGCCTGGCTT GAATTTCAAT GCTTTTGATT TCTCTAATAT GGCTAGTATT 180

CTCAACGATC CTAGCATCAG AGAAATGGCT GAGCAAATAG CTAAAGATCC TGCCTTTAAC 240

CAATTGGCTG AGCAGCTTCA GAGATCTATT CCTAACGCTG GCCAGGAAGG TGGTTTCCCT 300

AACTTTGATC CTCAACAGTA TGTCAATACA ATGCAACAGG TTATGCATAA CCCTGAGTTT 360

AAGACAATGG CCGAGAAACT TGGTACCGCC TTAGTTCAGG ATCCACAAAT GTCTCCTTTT 420

TTGGATGCTT TCTCGAATCC TGAAACAGCA GAACACTTTA CTGAGCGTAT GGCGCGGATG 480

AAAGAAGATC CAGAGTTGAA ACCTATACTA GATGAGATTG ATGCTGGTGG TCCTTCTGCC 540

ATGATGAAGT ACTGGAATGA TCCAGAAGTG CTGAAAAAGC TGGGTGAAGC AATGGGTATG 600

CCTGTTGCTG GCTTACCAGA CCAGACTGTT TCAGCTGAAC CTGAGGTAGC AGAAGAAGGT 660

GAAGAAGAAG AGTCTATTGT TCACCAAACT GCCAGTCTTG GTGATGTTGA GGGTTTGAAA 720

GCTGCCTTGG CATCTGGTGG TAACAAAGAT GAAGAAGATT CTGAAGGAAG GACAGCATTG 780

CATTTTGCTT GTGGATACGG CGAGTTGAAA TGTGCTCAAG TTCTTATCGA TGCTGGAGCA 840

AGTGTTAATG CGGTTGACAA AAACAAGAAC ACACCTCTGC ATTATGCTGC TGGTTACGGG 900

AGGAAAGAGA GTGTAAGCCT TCTCCTGGAG AATGGTGCTG CAGTCACTCT GCAAAACCTA 960

GACGAGAAGA CGCCAATTGA TGTAGCGAAG CTCAACAGCC AGCTGGAGGT GGTGAAGCTG 1020

CTTGAGAAGG ATGCTTTCCT TTGAGCTCTG CTGGTTAAAG GAAAGCTCTA AGCTCATATT 1080

GTCTTTGAGG CATTTGTCTT GTGTGTGTCC TGAACCACTT TCACAGGCTT TTTGTGTACA 1140

CTTTTTATTA GTTCCTCTCT TCTTCTAAAT TTGTCTCTTA TGTTGTTTTA AAAGTCAATA 1200

AAGAAAGAAA TAGCAATCAA TGATTTAATT TATGATTATA TTCTTTATTT CGTCGACCTC 1260

TACACAATGA TTCAATTTGG AAGAATCATT CTGGTTTGGA GGATATGTAA GAAAAACTAC 1320

TTGATCTCCA AGTTATTCCA TTCTTCTGTT GAAAAAA 1357



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 339
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Gly Thr Arg Lys Asn Pro Leu Leu Ser Asp Glu Lys Pro Lys Ser Thr
1 5 10 15

Glu Glu Asn Lys Ser Ser Lys Pro Glu Ser Ala Ser Gly Ser Ser Thr
20 25 30

Ser Ser Ala Met Pro Gly Leu Asn Phe Asn Ala Phe Asp Phe Ser Asn
35 40 45

Met Ala Ser Ile Leu Asn Asp Pro Ser Ile Arg Glu Met Ala Glu Gln
50 55 60

Ile Ala Lys Asp Pro Ala Phe Asn Gln Leu Ala Glu Gln Leu Gln Arg
65 70 75 80

Ser Ile Pro Asn Ala Gly Gln Glu Gly Gly Phe Pro Asn Phe Asp Pro
85 90 95

Gln Gln Tyr Val Asn Thr Met Gln Gln Val Met His Asn Pro Glu Phe
100 105 110

Lys Thr Met Ala Glu Lys Leu Gly Thr Ala Leu Val Gln Asp Pro Gln
115 120 125

Met Ser Pro Phe Leu Asp Ala Phe Ser Asn Pro Glu Thr Ala Glu His
130 135 140

Phe Thr Glu Arg Met Ala Arg Met Lys Glu Asp Pro Glu Leu Lys Pro
145 150 155 160

Ile Leu Asp Glu Ile Asp Ala Gly Gly Pro Ser Ala Met Met Lys Tyr
165 170 175

Trp Asn Asp Pro Glu Val Leu Lys Lys Leu Gly Glu Ala Met Gly Met
180 185 190

Pro Val Ala Gly Leu Pro Asp Gln Thr Val Ser Ala Glu Pro Glu Val
195 200 205

Ala Glu Glu Gly Glu Glu Glu Glu Ser Ile Val His Gln Thr Ala Ser
210 215 220

Leu Gly Asp Val Glu Gly Leu Lys Ala Ala Leu Ala Ser Gly Gly Asn
225 230 235 240

Lys Asp Glu Glu Asp Ser Glu Gly Arg Thr Ala Leu His Phe Ala Cys
245 250 255

Gly Tyr Gly Glu Leu Lys Cys Ala Gln Val Leu Ile Asp Ala Gly Ala
260 265 270

Ser Val Asn Ala Val Asp Lys Asn Lys Asn Thr Pro Leu His Tyr Ala
275 280 285

Ala Gly Tyr Gly Arg Lys Glu Ser Val Ser Leu Leu Leu Glu Asn Gly
290 295 300

Ala Ala Val Thr Leu Gln Asn Leu Asp Glu Lys Thr Pro Ile Asp Val
305 310 315 320

Ala Lys Leu Asn Ser Gln Leu Glu Val Val Lys Leu Leu Glu Lys Asp
325 330 335

Ala Phe Leu 339

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 663
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TTTTAAAAAA TTTTGCCATC AACCGTAGAT GTTCCGCCAA AGGGTGGGTT TAGCTTCGAT 60

CTGTGTAAGA GAAATGATAT TCTTACACAA AAGGGTCTTA AAGCTCCGTC TTTTTTGAAG 120

ACTGGAACAA CCATTGTTGG TTTGATTTTC AAGGATGGTG TGATACAAGG GGCAGATACC 180

CGAGCAACTG AGGGGCCAAT TGTTGCTGAT AAGAACTGTG AGAAGATTCA CTATATGGCA 240

CCAAACATAT ATTGCTGTGG TGCAGGAACT CGGGCTGATA CTGAAGCAGT CACTGATATG 300

GTCAGCTCAC AGCTGCGATT GCATCGTTAC CAGACTGGTC GAGACTCTCG GGTCATTACT 360

GCTTTGACCC TTCTCAAAAA ACATTTTTTC AGCTACCAAG GTCATGTCTC TGCTGCTCTT 420

GTACTCGGTG GAGTTGATAT CACTGGTCCA CATCTGCATA CTATATACCC ACACGGTTCA 480

ACTGACACTC TTCCATTCGC CACAATGGGT TCGGGTTCTC TTGCTGCTAT GTCTGTGTTT 540

GAGGCAAAGT ATAAAGAAGG CCTAACTAGG GATGAAGGAA TTAAGCTGGT CGCTGAATCC 600

ATATGCTCGG GTATATCCAA TGACCTGGGT AGTGGTAGCA ACGTGGACAT CTGCGTGATC 660

ACA 663



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Lys Ile Leu Pro Ser Thr Val Asp Val Pro Pro Lys Gly Gly Phe Ser
1 5 10 15

Phe Asp Leu Cys Lys Arg Asn Asp Ile Leu Thr Gln Lys Gly Leu Lys
20 25 30

Ala Pro Ser Phe Leu Lys Thr Gly Thr Thr Ile Val Gly Leu Ile Phe
35 40 45

Lys Asp Gly Val Ile Gln Gly Ala Asp Thr Arg Ala Thr Glu Gly Pro
50 55 60

Ile Val Ala Asp Lys Asn Cys Glu Lys Ile His Tyr Met Ala Pro Asn
65 70 75 80

Ile Tyr Cys Cys Gly Ala Gly Thr Arg Ala Asp Thr Glu Ala Val Thr
85 90 95

Asp Met Val Ser Ser Gln Leu Arg Leu His Arg Tyr Gln Thr Gly Arg
100 105 110

Asp Ser Arg Val Ile Thr Ala Leu Thr Leu Leu Lys Lys His Phe Phe
115 120 125

Ser Tyr Gln Gly His Val Ser Ala Ala Leu Val Leu Gly Gly Val Asp
130 135 140

Ile Thr Gly Pro His Leu His Thr Ile Tyr Pro His Gly Ser Thr Asp
145 150 155 160

Thr Leu Pro Phe Ala Thr Met Gly Ser Gly Ser Leu Ala Ala Met Ser
165 170 175

Val Phe Glu Ala Lys Tyr Lys Glu Gly Leu Thr Arg Asp Glu Gly Ile
180 185 190

Lys Leu Val Ala Glu Ser Ile Cys Ser Gly Ile Ser Asn Asp Leu Gly
195 200 205

Ser Gly Ser Asn Val Asp Ile Cys Val Ile Thr
210 215

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 975
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ACGAGAGGCC CTGAGACGCG GCAGATATCA GGTCCTGCGA CTTCAACACA GATCAGGAAC
TTCACATTAT GTCAGCATCT GCAAGGAATC CACACACATA TCTCATCCAT GGTAGCGGAC
CTTCCCAGTA TTGCTACTGA TGTATTGTCT CCTTATCTGG CTGCAATCTA TAATGCGGCA
TGTGAGCCAG TTACACCTTT GTTTAAAGCA ATGCGAGACA AGCTCGAGTC ATGCATTCTT
CAAATCCATG ATCAAAACTT TGGTGCTGAT GACGCTGACA TGGACAACAA CGCTTCCTCA
TACATGGAGG AGTTGCAGAG ATCGATTCTT CACTTCCGCA AGGAGTTCCT ATCTAGACTA
TTGCCTTCCG CAGCAAATGC TAACACTGCA GGAACAGAAT CGATCTGCAC AAGACTCACA
AGACAAATGG CGTCAAGGGT TTTGATCTTC TACATCAGAC ATGCATCCCT TGTGCGACCA
CTTTCAGAAT GGGGAAAACT CAGAATGGCC AAAGACATGG CCGAGCTGGA ACTAGCAGTG
GGACAGAATC TATTTCCCGT GGAACAACTC GGAGCACCGT ACAGAGCTCT TAGAGCGTTT
AGGCCTTTGG TTTTCCTGGA AACATCTCAA ATGGGATCAT CTCCTCTCAT CAATGATCTA
CCACCGAGCA TCGTCCTACA TCATCTCTAC ACAAGAGGCC CAGACGAGTT AGAGTCACCG
ATGCAGAAGA ACAGACTAAG TCCTAAACAG TACTCACTGT GGCTTGATAA CCAAAGAGAG
GATCAGATCT GGAAAGGGAT AAAAGCAACT TTGGATGATT ATGCAGTGAA GATCAGATCG
AGAGGGGACA AAGAGTTTAG TCCAGGTTAT CCTCTAATGC TTCAAATTGG TTCATCTTTA
ACACAAGAAA ACTTATAACC TGTGCTTTGT TACCGAATCA ATATTCTTCT ATTGCGAACT
TTTTTGTCTC AAAAAA

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 305
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Thr Arg Gly Pro Glu Thr Arg Gln Ile Ser Gly Pro Ala Thr Ser Thr
1 5 10 15

Gln Ile Arg Asn Phe Thr Leu Cys Gln His Leu Gln Gly Ile His Thr
20 25 30

His Ile Ser Ser Met Val Ala Asp Leu Pro Ser Ile Ala Thr Asp Val
35 40 45

Leu Ser Pro Tyr Leu Ala Ala Ile Tyr Asn Ala Ala Cys Glu Pro Val
50 55 60

Thr Pro Leu Phe Lys Ala Met Arg Asp Lys Leu Glu Ser Cys Ile Leu
65 70 75 80

Gln Ile His Asp Gln Asn Phe Gly Ala Asp Asp Ala Asp Met Asp Asn
85 90 95

Asn Ala Ser Ser Tyr Met Glu Glu Leu Gln Arg Ser Ile Leu His Phe
100 105 110

Arg Lys Glu Phe Leu Ser Arg Leu Leu Pro Ser Ala Ala Asn Ala Asn
115 120 125

Thr Ala Gly Thr Glu Ser Ile Cys Thr Arg Leu Thr Arg Gln Met Ala
130 135 140

Ser Arg Val Leu Ile Phe Tyr Ile Arg His Ala Ser Leu Val Arg Pro
145 150 155 160

Leu Ser Glu Trp Gly Lys Leu Arg Met Ala Lys Asp Met Ala Glu Leu
165 170 175

Glu Leu Ala Val Gly Gln Asn Leu Phe Pro Val Glu Gln Leu Gly Ala
180 185 190

Pro Tyr Arg Ala Leu Arg Ala Phe Arg Pro Leu Val Phe Leu Glu Thr
195 200 205

Ser Gln Met Gly Ser Ser Pro Leu Ile Asn Asp Leu Pro Pro Ser Ile
210 215 220

Val Leu His His Leu Tyr Thr Arg Gly Pro Asp Glu Leu Glu Ser Pro
225 230 235 240

Met Gln Lys Asn Arg Leu Ser Pro Lys Gln Tyr Ser Leu Trp Leu Asp
245 250 255

Asn Gln Arg Glu Asp Gln Ile Trp Lys Gly Ile Lys Ala Thr Leu Asp
260 265 270

Asp Tyr Ala Val Lys Ile Arg Ser Arg Gly Asp Lys Glu Phe Ser Pro
275 280 285

Gly Tyr Pro Leu Met Leu Gln Ile Gly Ser Ser Leu Thr Gln Glu Asn
290 295 300

Leu
305



Claims 

1. Recombinant AFT1 polypeptide.

4. The polypeptide of claim 3, wherein said polypeptide is AFT1 (34-248) or AFT1 (122-248).

5. The polypeptide of claim 1, 2 or 3, wherein said polypeptide is derived from a plant.

6. The polypeptide of claim 5, wherein said plant is a crucifer.

7. The polypeptide of claim 6, wherein said plant is Arabidopsis.

8. A chimeric AFT1 transcriptional activator protein comprising an AFT1 polypeptide fused to a DNA-binding polypeptide.

9. The chimeric AFT1 transcriptional activator of claim 8, wherein said DNA-binding polypeptide comprises

10. A transgenic plant containing a transgene comprising an AFT1 polypeptide operably linked to a constitutive or regulated promoter.
11. A transgenic plant containing a transgene comprising a chimeric AFT1 of claim 8.

12. A seed from a transgenic plant of claim 10 or 11.

13. A cell from a transgenic plant of claim 10 or 11.

14. A transgenic plant expressing a polypeptide of interest comprising:

(a) a nucleic acid sequence encoding the chimeric AFT1 transcriptional activator protein of claim 8; and

15. The polypeptide of claim 14, wherein said polypeptide comprises a plant storage protein gene.

16. Substantially pure DNA encoding an AFT1 protein.

18. The DNA of claims 16 and 17, wherein said DNA is operably linked to a constitutive or regulated promoter.

19. The DNA of claim 18, wherein said DNA is cDNA.

20. The DNA of claim 18, wherein said DNA is of the genus Arabidopsis.

22. A cell which contains the DNA of claim 16, claim 17, or the vector of claim 21.

23. The cell of claim 22, said cell being a plant cell.

24. A transgenic plant which contains the substantially pure DNA encoding an AFT1 protein.

26. A seed from a transgenic plant of claim 24 or claim 25.

27. A cell from a transgenic plant of claim 24 or claim 25.

29. The polypeptide of claim 28, wherein said polypeptide is AFT1 (33-194).

30. Substantially pure DNA encoding an AFT1 polypeptide fragment or analog of claim 28.

31. The DNA of claim 30, wherein said DNA is substantially identical to the DNA sequence shown in SEQ ID NO: 1.

32. The DNA of claim 31, wherein said DNA is operably linked to a constitutive or regulated promoter.
(SEQ ID NO: 1)

1 AAAAAAAAATCAAATCTCTCTCTTTCTCTCTCTAATGGCGGCGACATTAGGCAGAGACCA

MAATLGRDQ 9

61 GTATGTGTACATGGCGAAGCTCGCCGAGCAGGCGGAGCGTTACGAAGAGATGGTTCAATT

YVYMAKLAEQAERYEEMVQF 29

121 CATGGAACAGCTCGTTACAGGCGCTACTCCAGCGGAAGAGCTCACCGTTGAAGAGAGGAA

MEQLVTGATPAEELTVEERN 49

181 TCTCCTCTCTGTTGCTTACAAGAACGTGATCGGATCTCTACGCGCCGCCTGGAGGATCGT

LLSVAYKNVIGSLRAAWRIV 69

241 GTCTTCGATTGAGCAGAAGGAAGAGAGTAGGAAGAACGACGAGCACGTGTCGCTTGTCAA

SSIEQKEESRKNDEHVSLVK 89

301 GGATTACAGATCTAAAGTTGAGTCTGAGCTTTCTTCTGTTTGCTCTGGAATCCTTAAGCT

DYRSKVESELSSVCSGILKL 109

361 CCTTGACTCGCATCTGATCCCATCTGCTGGAGCGAGTGAGTCTAAGGTCTTTTACTTGAA

LDSHLIPSAGASESKVFYLK 129

421 GATGAAAGGTGATTATCATCGGTACATGGCTGAGTTTAAGTCTGGTGATGAGAGGAAAAC

MKGDYHRYMAEFKSGDERKT 149

481 TGCTGCTGAAGATACCATGCTCGCTTACAAAGCAGCTCAGGATATCGCAGCTGCGGATAT

AAEDTMLAYKAAQDIAAADM 169

541 GGCACCTACTCATCCGATAAGGCTTGGTCTGGCCCTGAATTTCTCAGTGTTCTACTATGA

APTHPIRLGLALNFSVFYYE 189

601 GATTCTCAATTCTTCAGACAAAGCTTGTAACATGGCCAAACAGGCTTTTGAGGAGGCCAT

ILNSSDKACNMAKQAFEEAI 209

661 AGCTGAGCTTGACACTCTGGGAGAGGAATCCTACAAAGACAGCACTCTCATAATGCAGTT

AELDTLGEESYKDSTLIMQL 229

721 GCTGAGGGACAATTTAACCCTTTGGACCTCCGATATGCAGGAGCAGATGGACGAGGCCTG

LRDNLTLWTSDMQEQMDEA 248

781 AGGATCTAGATGAAGGGGGGGAGGGTTGTTACGCGATGTTTCTGCCACCAAATCGATCTC
841 AAAAT
B42/AFT1 Derivatives    Growth    β-Galactosidase
B42/1-248               +         10.9
B42/1-121               -         1.7
B42/34-248              +         21.2
B42/122-248             +         15.3
B42/34-194                        1.8
B42 alone                         1.7

LexA/AFT1 Derivatives   Growth    β-Galactosidase
LexA/1-248              +         39.2
LexA/1-194                        0.7
LexA/1-121                        0.6
LexA/34-248             +         9.3
LexA/122-248                      1.2
LexA alone                        0.8
1 TCACCCAGAG AGGTCAGGCT TTGATGGACC ATGGACCCAA GAGCCGCTGA

51 AGTTTGACAA CTCCTACTTC GTGGAACTGC TGAAAGGAGA ATCAGAGGGC 

101 TTGTTGAAAC TTCCAACTGA CAAGACCTTA TTGGAAGACC CGGAGTTCCG 

151 TCGTCTTGTT GAGCTTTATG CAAAGGATGA AGATGCATTC TTCAGAGACT 

201 ACGCGGAATC GCACAAGAAA CTCTCTGAGC TTGGTTTCAA CCCAAACTCC 

251 TCAGCAGGCA AAGCAGTTGC AGACAGCACG ATTCTGGCAC AGAGTGCGTT 

301 CGGGGTTGCA GTTGCTGCTG CGGTTGTGGC ATTTGGTTAC TTTTACGAGA 

351 TTCGGAAGAG GATGAAGTAA ACGAAATAGG AAGGAAAACA CGAAGCAACG 

401 ATGCTCTTAT TTGGGTATTA AAGAAACTAT TAATCGTCTA TCGAATCTAT

451 TTTGCTGCTA CAAGATTCTA AACTCTTTGA ATCCACGATT CCACTGTTTA 

501 GTAGTAAAAA AGTTAAAAAG TCAATATTTT GGGTCCGTGA TTCATTTTTG 

551 CGATAAA 

(SEQ ID NO: 17) 



1 HPERSGFDGP WTQEPLKFDN SYFVELLKGE SEGLLKLPTD KTLLEDPEFR
51 RLVELYAKDE DAFFRDYAES HKKLSELGFN PNSSAGKAVA DSTILAQSAF
101 GVAVAAAVVA FGYFYEIRKR MK*

(SEQ ID NO: 18)
1 GAGTGACGAA CATTGCGTGA AATTCTTGAA GAACTGCTAC GAGTCACTTC

51 CAGAGGATGG AAAAGTGATA TTAGCAGAGT GTATTCTTCC AGAGACACCA 

101 GACTCAAGCC TCTCAACCAA ACAAGTAGTC CATGTCGATT GCATTATGTT 

151 GGCTCACAAT CCCGGAGGCA AAGAACGAAC CGAGAAAGAG TTTGAGGCAT 

201 TAGCCAAAGC ATCAGGCTTC AAGGGCATCA AAGTTGTCTG CGACGCTTTT 

251 GGTGTTAACC TTATTGAGTT ACTCAAGAAG CTCTAAAAAC AAACAATGTT 

301 CCTATGAAGA TGATTTATAT GTAAACATTA TCTCATATCT CCTTCCACGG 

351 TTCCAAAACT ATGCTGTTTA ATAATGGTTT TTACAAGAAT TTGATTATGA 

401 GTTTGTATTT TTGTTTGTTT GGAACAAAAT TATGTGATTA TAGGGAAAAA 

451 TAAAATGAGC TATTATTGAA GAAAAAAA 

(SEQ ID NO: 19) 
1 SDEHCVKFLK NCYESLPEDG KVILAECILP ETPDSSLSTK QVVHVDCIML
51 AHNPGGKERT EKEFEALAKA SGFKGIKVVC DAFGVNLIEL LKKL*

(SEQ ID NO: 20)



EP0693 554A1 

1 CCAGATTATC CCTCCCCCGA ATTCGGCACG AGGAAAAATC CTCTTCTTTC 

51 AGATGAGAAA CCCAAATCGA CGGAGGAGAA TAAGAGTTCT AAGCCGGAAT 

101 CAGCTTCTGG GAGTTCAACT TCATCAGCTA TGCCTGGCTT GAATTTCAAT 

151 GCTTTTGATT TCTCTAATAT GGCTAGTATT CTCAACGATC CTAGCATCAG 

201 AGAAATGGCT GAGCAAATAG CTAAAGATCC TGCCTTTAAC CAATTGGCTG 

251 AGCAGCTTCA GAGATCTATT CCTAACGCTG GCCAGGAAGG TGGTTTCCCT 

301 AACTTTGATC CTCAACAGTA TGTCAATACA ATGCAACAGG TTATGCATAA 

351 CCCTGAGTTT AAGACAATGG CCGAGAAACT TGGTACCGCC TTAGTTCAGG 

401 ATCCACAAAT GTCTCCTTTT TTGGATGCTT TCTCGAATCC TGAAACAGCA

451 GAACACTTTA CTGAGCGTAT GGCGCGGATG AAAGAAGATC CAGAGTTGAA 

501 ACCTATACTA GATGAGATTG ATGCTGGTGG TCCTTCTGCC ATGATGAAGT 

551 ACTGGAATGA TCCAGAAGTG CTGAAAAAGC TGGGTGAAGC AATGGGTATG 

601 CCTGTTGCTG GCTTACCAGA CCAGACTGTT TCAGCTGAAC CTGAGGTAGC 

651 AGAAGAAGGT GAAGAAGAAG AGTCTATTGT TCACCAAACT GCCAGTCTTG 

701 GTGATGTTGA GGGTTTGAAA GCTGCCTTGG CATCTGGTGG TAACAAAGAT 

751 GAAGAAGATT CTGAAGGAAG GACAGCATTG CATTTTGCTT GTGGATACGG 

801 CGAGTTGAAA TGTGCTCAAG TTCTTATCGA TGCTGGAGCA AGTGTTAATG 

851 CGGTTGACAA AAACAAGAAC ACACCTCTGC ATTATGCTGC TGGTTACGGG 

901 AGGAAAGAGA GTGTAAGCCT TCTCCTGGAG AATGGTGCTG CAGTCACTCT 

951 GCAAAACCTA GACGAGAAGA CGCCAATTGA TGTAGCGAAG CTCAACAGCC 

1001 AGCTGGAGGT GGTGAAGCTG CTTGAGAAGG ATGCTTTCCT TTGAGCTCTG 

1051 CTGGTTAAAG GAAAGCTCTA AGCTCATATT GTCTTTGAGG CATTTGTCTT 

1101 GTGTGTGTCC TGAACCAGTT TCACAGGCTT TTTGTGTACA CTTTTTATTA 

1151 GTTCCTCTCT TCTTCTAAAT TTGTCTCTTA TGTTGTTTTA AAAGTCAATA 

1201 AAGAAAGAAA TAGCAATCAA TGATTTAATT TATGATTATA TTCTTTATTT 

1251 CGTCGACCTC TACAGAATGA TTCAATTTGG AAGAATCATT CTGGTTTGGA 

1301 GGATATGTAA GAAAAACTAC TTGATCTCCA AGTTATTCCA TTCTTCTGTT 

1351 GAAAAAA 

(SEQ ID NO: 21)

1 GTRKNPLLSD EKPKSTEENK SSKPESASGS STSSAMPGLN FNAFDFSNMA

51 SILNDPSIRE MAEQIAKDPA FNQLAEQLQR SIPNAGQEGG FPNFDPQQYV

101 NTMQQVMHNP EFKTMAEKLG TALVQDPQMS PFLDAFSNPE TAEHFTERMA

151 RMKEDPELKP ILDEIDAGGP SAMMKYWNDP EVLKKLGEAM GMPVAGLPDQ

201 TVSAEPEVAE EGEEEESIVH QTASLGDVEG LKAALASGGN KDEEDSEGRT

251 ALHFACGYGE LKCAQVLIDA GASVNAVDKN KNTPLHYAAG YGRKESVSLL

301 LENGAAVTLQ NLDEKTPIDV AKLNSQLEVV KLLEKDAFL*

(SEQ ID NO: 22)
1 TTTTAAAAAA TTTTGCCATC AACCGTAGAT GTTCCGCCAA AGGGTGGGTT

51 TAGCTTCGAT CTGTGTAAGA GAAATGATAT TCTTACACAA AAGGGTCTTA 

101 AAGCTCCGTC TTTTTTGAAG ACTGGAACAA CCATTGTTGG TTTGATTTTC 

151 AAGGATGGTG TGATACAAGG GGCAGATACC CGAGCAACTG AGGGGCCAAT 

201 TGTTGCTGAT AAGAACTGTG AGAAGATTCA CTATATGGCA CCAAACATAT 

251 ATTGCTGTGG TGCAGGAACT CGGGCTGATA CTGAAGCAGT CACTGATATG 

301 GTCAGCTCAC AGCTGCGATT GCATCGTTAC CAGACTGGTC GAGACTCTCG

351 GGTCATTACT GCTTTGACCC TTCTCAAAAA ACATTTTTTC AGCTACCAAG 

401 GTCATGTCTC TGCTGCTCTT GTACTCGGTG GAGTTGATAT CACTGGTCCA 

451 CATCTGCATA CTATATACCC ACACGGTTCA ACTGACACTC TTCCATTCGC 

501 CACAATGGGT TCGGGTTCTC TTGCTGCTAT GTCTGTGTTT GAGGCAAAGT 

551 ATAAAGAAGG CCTAACTAGG GATGAAGGAA TTAAGCTGGT CGCTGAATCC 

601 ATATGCTCGG GTATATCCAA TGACCTGGGT AGTGGTAGCA ACGTGGACAT 

651 CTGCGTGATC ACA

(SEQ ID NO: 23) 
KILPSTVD VPPKGGFSFD LCKRNDILTQ KGLKAPSFLK TGTTIVGLIF
KDGVIQGADT RATEGPIVAD KNCEKIHYMA PNIYCCGAGT RADTEAVTDM
VSSQLRLHRY QTGRDSRVIT ALTLLKKHFF SYQGHVSAAL VLGGVDITGP
HLHTIYPHGS TDTLPFATMG SGSLAAMSVF EAKYKEGLTR DEGIKLVAES
ICSGISNDLG SGSNVDICVI T

(SEQ ID NO: 24)
1 ACGAGAGGCC CTGAGACGCG GCAGATATCA GGTCCTGCGA CTTCAACACA

51 GATCAGGAAC TTCACATTAT GTCAGCATCT GCAAGGAATC CACACACATA 

101 TCTCATCCAT GGTAGCGGAC CTTCCCAGTA TTGCTACTGA TGTATTGTCT 

151 CCTTATCTGG CTGCAATCTA TAATGCGGCA TGTGAGCCAG TTACACCTTT 

201 GTTTAAAGCA ATGCGAGACA AGCTCGAGTC ATGCATTCTT CAAATCCATG 

251 ATCAAAACTT TGGTGCTGAT GACGCTGACA TGGACAACAA CGCTTCCTCA 

301 TACATGGAGG AGTTGCAGAG ATCGATTCTT CACTTCCGCA AGGAGTTCCT 

351 ATCTAGACTA TTGCCTTCCG CAGCAAATGC TAACACTGCA GGAACAGAAT 

401 CGATCTGCAC AAGACTCACA AGACAAATGG CGTCAAGGGT TTTGATCTTC 

451 TACATCAGAC ATGCATCCCT TGTGCGACCA CTTTCAGAAT GGGGAAAACT 

501 CAGAATGGCC AAAGACATGG CCGAGCTGGA ACTAGCAGTG GGACAGAATC 

551 TATTTCCCGT GGAACAACTC GGAGCACCGT ACAGAGCTCT TAGAGCGTTT 

601 AGGCCTTTGG TTTTCCTGGA AACATCTCAA ATGGGATCAT CTCCTCTCAT 

651 CAATGATCTA CCACCGAGCA TCGTCCTACA TCATCTCTAC ACAAGAGGCC 

701 CAGACGAGTT AGAGTCACCG ATGCAGAAGA ACAGACTAAG TCCTAAACAG 

751 TACTCACTGT GGCTTGATAA CCAAAGAGAG GATCAGATCT GGAAAGGGAT 

801 AAAAGCAACT TTGGATGATT ATGCAGTGAA GATCAGATCG AGAGGGGACA 

851 AAGAGTTTAG TCCAGGTTAT CCTCTAATGC TTCAAATTGG TTCATCTTTA 

901 ACACAAGAAA ACTTATAAGC TGTGCTTTGT TACCGAATCA ATATTCTTCT 

951 ATTGCGAACT TTTTTGTCTC AAAAAA 



(SEQ ID NO: 25) 
1 TRGPETRQIS GPATSTQIRN FTLCQHLQGI HTHISSMVAD LPSIATDVLS

51 PYLAAIYNAA CEPVTPLFKA MRDKLESCIL QIHDQNFGAD DADMDNNASS

101 YMEELQRSIL HFRKEFLSRL LPSAANANTA GTESICTRLT RQMASRVLIF

151 YIRHASLVRP LSEWGKLRMA KDMAELELAV GQNLFPVEQL GAPYRALRAF

201 RPLVFLETSQ MGSSPLINDL PPSIVLHHLY TRGPDELESP MQKNRLSPKQ

251 YSLWLDNQRE DQIWKGIKAT LDDYAVKIRS RGDKEFSPGY PLMLQIGSSL

301 TQENL*

(SEQ ID NO: 26)